Abstract

Let $\mathbb{F} = \mathbb{F}_p$ for any fixed prime $p \geq 2$. An affine-invariant property is a property of functions
on $\mathbb{F}^n$ that is closed under taking affine transformations of the domain. We prove that all
affine-invariant properties that have local characterizations are testable. In fact, we give a
proximity-oblivious test for any such property $\mathcal{P}$, meaning that given an input function $f$, we
make a constant number of queries to $f$, always accept if $f$ satisfies $\mathcal{P}$, and otherwise reject
with probability larger than a positive number that depends only on the distance between $f$
and $\mathcal{P}$. More generally, we show that any affine-invariant property that is closed under taking
restrictions to subspaces and has bounded complexity is testable.

We also prove that any property that can be described as the property of being decomposable 
into a known structure of low-degree polynomials is locally characterized and is, hence, testable. 
For example, whether a function is a product of two degree-$d$ polynomials, whether a function
splits into a product of d linear polynomials, and whether a function has low rank are all 
examples of degree-structural properties and are therefore locally characterized. 

Our results use a new Gowers inverse theorem by Tao and Ziegler for low characteristic fields 
that decomposes any polynomial with large Gowers norm into a function of a small number of 
low-degree non-classical polynomials. We establish a new equidistribution result for high rank
non-classical polynomials that drives the proofs of both the testability results and the local 
characterization of degree-structural properties. 



1 Introduction 

The field of property testing, as initiated by [BLR93, BFL91] and defined formally by [RS96, 
GGR98], is the study of algorithms that query their input a very small number of times and 
with high probability decide correctly whether their input satisfies a given property or is "far" 
from satisfying that property. A property is called testable, or sometimes strongly testable or locally 
testable, if the number of queries can be made independent of the size of the object without affecting 
the correctness probability. Perhaps surprisingly, it has been found that a large number of natural 
properties satisfy this strong requirement; see e.g. the surveys [Fis04, Rub06, Ron09, Sud10] for a
general overview. 

The focus of our work is on testing properties of multivariate functions over finite fields. Fix a
prime $p \geq 2$ and an integer $R \geq 2$ throughout. Let $\mathbb{F} = \mathbb{F}_p$. We consider properties of functions
$f : \mathbb{F}^n \to \{1, \dots, R\}$. Our main result shows that any such property that is invariant with respect
to affine transformations on $\mathbb{F}^n$ and that is locally characterized is testable. Furthermore, we show
that a large class of natural algebraic properties whose query complexity had not been previously 
studied are locally characterized affine-invariant properties and are, hence, testable. Our results 
constitute an exact characterization of proximity-obliviously testable properties, the most common 
notion of testability considered for algebraic properties. In the rest of this section, we motivate and 
describe our results in more detail. 

1.1 Testability and Invariances 

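Let $[R]$ denote the set $\{1, \dots, R\}$. Given a property $\mathcal{P}$ of functions in $\{\mathbb{F}^n \to [R] \mid n \in \mathbb{Z}_{\geq 0}\}$, we
say that $f : \mathbb{F}^n \to [R]$ is $\varepsilon$-far from $\mathcal{P}$ if

$$\min_{g \in \mathcal{P}} \Pr_{x \in \mathbb{F}^n}[f(x) \neq g(x)] > \varepsilon,$$

and we say that it is $\varepsilon$-close otherwise.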

Definition 1.1 (Testability). A property $\mathcal{P}$ is said to be testable (with one-sided error) if there
are functions $q : (0,1) \to \mathbb{Z}_{>0}$, $\delta : (0,1) \to (0,1)$, and an algorithm $T$ that, given as input a
parameter $\varepsilon > 0$ and oracle access to a function $f : \mathbb{F}^n \to [R]$, makes at most $q(\varepsilon)$ queries to the
oracle for $f$, always accepts if $f \in \mathcal{P}$ and rejects with probability at least $\delta(\varepsilon)$ if $f$ is $\varepsilon$-far from $\mathcal{P}$.
If, furthermore, $q$ is a constant function, then $\mathcal{P}$ is said to be proximity-obliviously testable (PO
testable).

The term proximity-oblivious testing was coined by Goldreich and Ron in [GR11]. As an example
of a testable (in fact, PO testable) property, let us recall the famous result by Blum, Luby and
Rubinfeld [BLR93] which initiated this line of research. They showed that linearity of a function
$f : \mathbb{F}^n \to \mathbb{F}$ is testable by a test which makes 3 queries. This test accepts if $f$ is linear and rejects
with probability $\Omega(\varepsilon)$ if $f$ is $\varepsilon$-far from linear.

Linearity, in addition to being testable, is also an example of a linear-invariant property. We say
that a property $\mathcal{P} \subseteq \{\mathbb{F}^n \to [R]\}$ is linear-invariant if for any $f \in \mathcal{P}$ and for any
linear transformation $L : \mathbb{F}^n \to \mathbb{F}^n$, it holds that $f \circ L \in \mathcal{P}$. Similarly, an affine-invariant property
is closed under composition with affine transformations $A : \mathbb{F}^n \to \mathbb{F}^n$ (an affine transformation $A$
is of the form $L + c$ where $L$ is linear and $c$ is a constant). The property of a function $f : \mathbb{F}^n \to \mathbb{F}$
being affine is testable by a simple reduction to [BLR93], and is itself affine-invariant. Other well-studied
examples of affine-invariant (and hence, linear-invariant) properties include Reed-Muller
codes (in other words, bounded degree polynomials) [BFL91, BFLS91, FGL+96, RS96, AKK+05]
and Fourier sparsity [GOS+09]. In fact, affine invariance seems to be a common feature of most
interesting properties that one would classify as "algebraic". Kaufman and Sudan in [KS08] made
explicit note of this phenomenon and initiated a general study of the testability of affine-invariant
properties (see also [GK11]). In particular, they asked for necessary and sufficient conditions for
the testability of affine-invariant properties.

1.2 Locally Characterized Properties

The result summarized in the title of this paper gives a necessary and sufficient condition for affine- 
invariant properties to be PO testable. Let us first see why "local characterization" is a necessary 
condition for PO testability. 

For a PO testable property $\mathcal{P}$, if a function $f$ does not satisfy $\mathcal{P}$, then by Definition 1.1, the
tester rejects $f$ with positive probability. Since the test always accepts functions with the property,
there must be $q$ points $x_1, \dots, x_q \in \mathbb{F}^n$ that form a witness for non-membership in $\mathcal{P}$. These are the
queries that cause the tester to reject. Thus, denoting $\sigma = (f(x_1), \dots, f(x_q)) \in [R]^q$, we say that
$C = (x_1, x_2, \dots, x_q; \sigma)$ forms a $q$-local constraint for $\mathcal{P}$. This means that whenever the constraint
is violated by a function $g$, i.e., $(g(x_1), \dots, g(x_q)) = \sigma$, we know that $g$ is not in $\mathcal{P}$. A property
$\mathcal{P}$ is $q$-locally characterized if there exists a collection of $q$-local constraints $C_1, \dots, C_m$ such that
$g \in \mathcal{P}$ if and only if none of the constraints $C_1, \dots, C_m$ are violated. It follows from the above
discussion that if $\mathcal{P}$ is PO testable with $q$ queries, then $\mathcal{P}$ is $q$-locally characterized. We say $\mathcal{P}$ is
locally characterized if it is $q$-locally characterized for some constant $q$.

We now give some examples of locally characterized affine-invariant properties. Consider the
property of being affine. It is 4-locally characterized because a function $f$ is affine if and only if
$f(x) - f(x+y) - f(x+z) + f(x+y+z) = 0$ for every $x, y, z \in \mathbb{F}^n$. Note that this characterization
automatically suggests a 4-query test: pick random $x, y, z \in \mathbb{F}^n$ and check whether the identity
holds or not for that choice of $x, y, z$. More generally, consider the property of being a polynomial
of degree at most $d$, for some fixed integer $d > 0$. The property is known to be PO testable due to
independent work of [KR06, JPRZ04], and their test is based upon a $p^{\lceil (d+1)/(p-1) \rceil}$-local characterization.
Again, the test is simply to pick a random constraint and check if it is violated.
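
To make the 4-query test for affinity concrete, here is a minimal Python sketch. It is only an illustration: the function names, the representation of points of $\mathbb{F}_p^n$ as integer tuples, and the exhaustive computation of the rejection probability (feasible only for tiny $n$) are our own assumptions and are not taken from the cited works.

```python
import itertools
import random

def add(x, y, p):
    """Coordinate-wise addition in F_p^n."""
    return tuple((a + b) % p for a, b in zip(x, y))

def violates(f, x, y, z, p):
    """True if f(x) - f(x+y) - f(x+z) + f(x+y+z) != 0 (mod p)."""
    xy, xz = add(x, y, p), add(x, z, p)
    xyz = add(xy, z, p)
    return (f(x) - f(xy) - f(xz) + f(xyz)) % p != 0

def affinity_test(f, n, p, rng):
    """One run of the 4-query test; returns True to accept."""
    x, y, z = (tuple(rng.randrange(p) for _ in range(n)) for _ in range(3))
    return not violates(f, x, y, z, p)

def rejection_probability(f, n, p):
    """Exact rejection probability, enumerating all triples (x, y, z)."""
    points = list(itertools.product(range(p), repeat=n))
    bad = sum(violates(f, x, y, z, p)
              for x in points for y in points for z in points)
    return bad / len(points) ** 3

# Hypothetical example functions on F_3^2.
affine = lambda x: (2 * x[0] + x[1] + 1) % 3   # affine, so never rejected
quadratic = lambda x: (x[0] * x[1]) % 3        # degree 2, rejected sometimes

rng = random.Random(0)
print(affinity_test(affine, 2, 3, rng))
print(rejection_probability(quadratic, 2, 3))
```

The one-sided nature of the test is visible here: an affine function is accepted on every choice of $x, y, z$, while a non-affine function is rejected with some positive probability.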

Indeed, for any $q$-locally characterized property $\mathcal{P}$ defined by constraints $C_1, \dots, C_m$, one can
design the following $q$-query test: choose a constraint $C_i$ uniformly at random and reject only if
the input function violates $C_i$. Clearly, if the input function $f$ is in $\mathcal{P}$, the test always accepts.
The question is the probability with which a function $\varepsilon$-far from $\mathcal{P}$ is rejected. We show that for
affine-invariant properties, this test always rejects with probability bounded away from zero for
every constant $\varepsilon > 0$.

Theorem 1.2. Every $q$-locally characterized affine-invariant property is proximity-obliviously testable
with $q$ queries.

1.3 Subspace Hereditary Properties 

Just as a necessary condition for PO testability is local characterization, one can formulate a natural 
condition that is (almost) necessary for testability in general. In the context of affine-invariant 
properties, the condition can be succinctly stated as follows: 

Definition 1.3 (Subspace hereditary properties). An affine-invariant property $\mathcal{P}$ is said to be
(affine) subspace hereditary if for any $f : \mathbb{F}^n \to [R]$ satisfying $\mathcal{P}$, the restriction of $f$ to any affine
subspace of $\mathbb{F}^n$ also satisfies $\mathcal{P}$.

In [BGS10], it is shown that every affine-invariant property testable by a "natural" tester is very
"close" to a subspace hereditary property¹. Thus, if we gloss over some technicalities, subspace
hereditariness is a necessary condition for testability. In the opposite direction, [BGS10] conjectures
the following:

Conjecture 1.4 ([BGS10]). Every subspace hereditary property is testable.

¹We omit the technical definitions of "natural" and "close", since they are unimportant here. Informally, the
behavior of a "natural" tester is independent of the size of the domain and "close" means that the property deviates

Resolving Conjecture 1.4 would yield a combinatorial characterization of the (natural) one- 
sided testable affine-invariant properties, similar to the characterization for testable dense graph 
properties [AS08a]. Although we are not yet able to confirm or refute the full Conjecture 1.4, 
we can show testability if we make an additional assumption of "bounded complexity", defined 
formally in Section 1.5.2. 

Theorem 1.5 (Informal). Every subspace hereditary property of "bounded complexity" is testable. 

We will formally define the notion of complexity later on in Section 1.5.2, but for now, it suffices 
to know that it is an integer that we will associate with each property (independent of n). Also, q- 
locally characterized properties are of complexity at most q. All natural affine-invariant properties 
that we know of have bounded complexity (in fact, most are locally characterized). So, the subspace 
hereditary properties not covered by Theorem 1.5 seem to be mainly of theoretical interest. 

1.4 Degree-structural Properties 

The conditions required in Theorem 1.2 and Theorem 1.5 are very general, and so, we expect that 
they are satisfied by many interesting algebraic properties. This, in fact, turns out to be the case. 
We show that a class of properties that we call degree-structural are all locally characterized and
are, hence, testable by Theorem 1.2. We give the definition below in Definition 1.6. First let us 
list some examples of degree-structural properties. Let d be a fixed positive integer. Each of the 
following definitions defines a degree-structural property. 

• Degree $\leq d$: The degree of the function $F : \mathbb{F}^n \to \mathbb{F}$ as a polynomial is at most $d$;

• Splitting: A function $F : \mathbb{F}^n \to \mathbb{F}$ splits if it can be written as a product of at most $d$ linear
functions;

• Factorization: A function $F : \mathbb{F}^n \to \mathbb{F}$ factors if $F = GH$ for polynomials $G, H : \mathbb{F}^n \to \mathbb{F}$
such that $\deg(G) \leq d-1$ and $\deg(H) \leq d-1$;

• Sum of two products: A function $F : \mathbb{F}^n \to \mathbb{F}$ is a sum of two products if there are
polynomials $G_1, G_2, G_3, G_4$ such that $F = G_1 G_2 + G_3 G_4$ and $\deg(G_i) \leq d-1$ for $i \in \{1, 2, 3, 4\}$;

• Having square root: A function $F : \mathbb{F}^n \to \mathbb{F}$ has a square root if $F = G^2$ for a polynomial
$G$ with $\deg(G) \leq d/2$;

• Low $d$-rank: for a fixed integer $r > 0$, a function $F : \mathbb{F}^n \to \mathbb{F}$ has $d$-rank at most $r$ if there
exist polynomials $G_1, \dots, G_r : \mathbb{F}^n \to \mathbb{F}$ of degree $\leq d-1$ and a function $\Gamma : \mathbb{F}^r \to \mathbb{F}$ such that
$F = \Gamma(G_1, \dots, G_r)$.

In fact, roughly speaking, any property that can be described as the property of decomposing 
into a known structure of low-degree polynomials is degree-structural. 

from an actual affine subspace hereditary property on functions over a finite domain. See [BGS10] for details, or
[AS08a] for the analogous definitions in a graph-theoretic context.

Definition 1.6 (Degree-structural property). Given an integer $c > 0$, a vector of non-negative
integers $\mathbf{d} = (d_1, \dots, d_c) \in \mathbb{Z}^c_{\geq 0}$, and a function $\Gamma : \mathbb{F}^c \to \mathbb{F}$, define the $(c, \mathbf{d}, \Gamma)$-structured property
to be the collection of functions $F : \mathbb{F}^n \to \mathbb{F}$ for which there exist polynomials $P_1, \dots, P_c : \mathbb{F}^n \to \mathbb{F}$
satisfying $F(x) = \Gamma(P_1(x), \dots, P_c(x))$ for all $x \in \mathbb{F}^n$ and $\deg(P_i) \leq d_i$ for all $i \in [c]$.

We say a property $\mathcal{P}$ is degree-structural if there exist integers $\sigma, \Delta > 0$ and a set of tuples
$S \subseteq \{(c, \mathbf{d}, \Gamma) \mid c \in [\sigma], \mathbf{d} \in [0, \Delta]^c, \Gamma : \mathbb{F}^c \to \mathbb{F}\}$, such that a function $F : \mathbb{F}^n \to \mathbb{F}$ satisfies $\mathcal{P}$ if and
only if $F$ is $(c, \mathbf{d}, \Gamma)$-structured for some $(c, \mathbf{d}, \Gamma) \in S$. We call $\sigma$ the scope and $\Delta$ the max-degree
of the degree-structural property $\mathcal{P}$.

It is straightforward to see that the examples above satisfy this definition. Our main result for 
degree-structural properties is the following: 

Theorem 1.7. Every degree-structural property with bounded scope and max-degree is a locally
characterized affine-invariant property. 

Combining Theorem 1.7 with Theorem 1.2 implies PO testability for all degree-structural prop- 
erties. 

1.5 Formal version of the Main Result 

In this section, we describe our main result, Theorem 1.5, rigorously. Theorem 1.2 follows as a 
corollary. We first need to set up some notions. Just as a locally characterized property can be 
described by a list of constraints, subspace hereditary properties can also be described similarly, 
but here, the size of the list can be infinite. For affine-invariant properties, we can represent the 
constraints in a very special form, as "induced affine constraints". We first describe these, then 
define the notion of complexity, and finally state the theorem. 

1.5.1 Affine constraints

A linear form on $k$ variables is a vector $L = (w_1, w_2, \dots, w_k) \in \mathbb{F}^k$ that is interpreted as a function
from $(\mathbb{F}^n)^k$ to $\mathbb{F}^n$ via the map $(x_1, \dots, x_k) \mapsto w_1 x_1 + w_2 x_2 + \cdots + w_k x_k$. A linear form $L =
(w_1, w_2, \dots, w_k)$ is said to be affine if $w_1 = 1$. From now on, linear forms will always be assumed to
be affine.

We specify a partial order $\preceq$ among affine forms. We say $(w_1, \dots, w_k) \preceq (w'_1, \dots, w'_k)$ if $|w_i| \leq
|w'_i|$ for all $i \in [k]$, where $|\cdot|$ is the obvious map from $\mathbb{F}$ to $\{0, 1, \dots, p-1\}$. An affine constraint
is a collection of affine forms, with the added technical restriction of being downward-closed with
respect to $\preceq$. For future reference, we state this as the following definition.

Definition 1.8 (Affine constraints). An affine constraint of size $m$ on $k$ variables is a tuple $A =
(L_1, \dots, L_m)$ of $m$ affine forms $L_1, \dots, L_m$ over $\mathbb{F}$ on $k$ variables, where:

(i) $L_1(x_1, \dots, x_k) = x_1$;

(ii) If $L$ belongs to $A$ and $L' \preceq L$, then $L'$ also belongs to $A$.

Any subspace hereditary property can be described using affine constraints and forbidden pat- 
terns, in the following way. 

Definition 1.9 (Properties defined by induced affine constraints). 

• An induced affine constraint of size $m$ on $\ell$ variables is a pair $(A, \sigma)$ where $A$ is an affine
constraint of size $m$ on $\ell$ variables and $\sigma \in [R]^m$.

• Given such an induced affine constraint $(A, \sigma)$, a function $f : \mathbb{F}^n \to [R]$ is said to be $(A, \sigma)$-free
if there exist no $x_1, \dots, x_\ell \in \mathbb{F}^n$ such that $(f(L_1(x_1, \dots, x_\ell)), \dots, f(L_m(x_1, \dots, x_\ell))) = \sigma$.
On the other hand, if such $x_1, \dots, x_\ell$ exist, we say that $f$ induces $(A, \sigma)$ at $x_1, \dots, x_\ell$.

• Given a (possibly infinite) collection $\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), \dots, (A^i, \sigma^i), \dots\}$ of induced
affine constraints, a function $f : \mathbb{F}^n \to [R]$ is said to be $\mathcal{A}$-free if it is $(A^i, \sigma^i)$-free for every
$i \geq 1$.

As an example, consider the property of having degree at most 1 as a polynomial, for a function
$F : \mathbb{F}^n \to \mathbb{F}$. It is easy to see that $F$ satisfies this property if and only if $F(x_1) - F(x_1 + x_2) - F(x_1 +
x_3) + F(x_1 + x_2 + x_3) = 0$ for all $x_1, x_2, x_3 \in \mathbb{F}^n$. Consequently the property can be defined by the set
of induced affine constraints that forbid any values for $F(x_1), F(x_1 + x_2), F(x_1 + x_3), F(x_1 + x_2 + x_3)$
that do not satisfy the identity $F(x_1) - F(x_1 + x_2) - F(x_1 + x_3) + F(x_1 + x_2 + x_3) = 0$.
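
As an illustration of Definition 1.9 on this example, the following Python sketch checks by brute force whether a function induces a given pattern. It is a toy under stated assumptions: functions are stored as dictionaries on $\mathbb{F}_2^n$, values lie in $\mathbb{F}_2$ rather than $[R]$, and all names (`apply_form`, `find_induced_copy`, `is_free`) are hypothetical helpers of ours. The search is exponential in the number of variables and is meant only for very small $n$.

```python
import itertools

def apply_form(form, xs, p):
    """Evaluate the form (w_1, ..., w_l) at points x_1, ..., x_l of F_p^n."""
    n = len(xs[0])
    return tuple(sum(w * x[c] for w, x in zip(form, xs)) % p for c in range(n))

def find_induced_copy(f, forms, sigma, n, p):
    """Return x_1..x_l with f(L_i(x)) = sigma_i for all i, or None if f is free."""
    points = list(itertools.product(range(p), repeat=n))
    for xs in itertools.product(points, repeat=len(forms[0])):
        if all(f[apply_form(L, xs, p)] == s for L, s in zip(forms, sigma)):
            return xs
    return None

# The forms x1, x1+x2, x1+x3, x1+x2+x3 from the degree-1 example, over F_2.
FORMS = [(1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)]
# Forbidden patterns: value tuples (a, b, c, d) with a - b - c + d != 0 (mod 2).
FORBIDDEN = [s for s in itertools.product(range(2), repeat=4)
             if (s[0] - s[1] - s[2] + s[3]) % 2 != 0]

def is_free(f, n, p=2):
    """True if f induces none of the forbidden patterns."""
    return all(find_induced_copy(f, FORMS, s, n, p) is None for s in FORBIDDEN)

points = list(itertools.product(range(2), repeat=2))
affine_f = {x: (x[0] + x[1] + 1) % 2 for x in points}  # degree 1
and_f = {x: x[0] * x[1] for x in points}               # degree 2

print(is_free(affine_f, 2))
print(find_induced_copy(and_f, FORMS, (0, 0, 0, 1), 2, 2))
```

Here the degree-1 function avoids every forbidden pattern, while the product $x_1 x_2$ induces the pattern $(0, 0, 0, 1)$ and is therefore not free.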

The connection between affine subspace hereditariness and affine constraints is given by the 
following simple observation. 

Observation 1.10. An affine-invariant property $\mathcal{P}$ is subspace hereditary if and only if it is equivalent
to the property of $\mathcal{A}$-freeness for some fixed collection $\mathcal{A}$ of induced affine constraints.

Proof. Given an affine-invariant property $\mathcal{P}$, a simple (though inefficient) way to obtain the set
$\mathcal{A}$ is to let it be the following: For every $n$ and a function $f : \mathbb{F}^n \to [R]$ that is not in $\mathcal{P}$,
we include in $\mathcal{A}$ the constraint $(A_f, \sigma_f)$, where $A_f$ is indexed by members of $\mathbb{F}^n$ and contains
$\{L_z(X_1, \dots, X_{n+1}) = X_1 + \sum_{i=1}^{n} z_i X_{i+1} : z = (z_1, \dots, z_n) \in \mathbb{F}^n\}$, and $\sigma_f$ is just set to $f$.

Setting $X_1 = 0$ and $X_{i+1}$ to the $i$-th standard vector $e_i$ for every $i \in [n]$ shows that $f$ is not
$(A_f, \sigma_f)$-free. Hence the property defined by $\mathcal{A}$ is contained in $\mathcal{P}$. The containment in the other
direction follows from $\mathcal{P}$ being affine-invariant and hereditary.

The other direction of the observation is trivial. □

1.5.2 Complexity of linear forms 

Green and Tao, in their seminal work on arithmetic progressions in primes, introduced the following 
notion of complexity of linear forms. 

Definition 1.11 (Cauchy-Schwarz complexity, [GT10]). Let $\mathcal{L} = \{L_1, \dots, L_m\}$ be a set of linear
forms. The (Cauchy-Schwarz) complexity of $\mathcal{L}$ is the minimal $d$ such that the following holds. For
every $i \in [m]$, we can partition $\{L_j\}_{j \in [m] \setminus \{i\}}$ into $d+1$ subsets such that $L_i$ does not belong to the
linear span of any subset.
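
For small systems, Definition 1.11 can be evaluated by brute force. The Python sketch below is our own illustration and not an algorithm from [GT10]: forms are coefficient tuples over $\mathbb{F}_p$, spans are tested with Gaussian elimination modulo $p$, and all partitions are enumerated, so the running time is exponential in the number of forms. We assume that empty classes are allowed in a partition and that a system with a single form has complexity 0. The modular inverse via `pow(a, -1, p)` requires Python 3.8 or later.

```python
import itertools

def rank_mod_p(rows, p):
    """Rank of a list of vectors over F_p, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                c = rows[i][col]
                rows[i] = [(a - c * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def in_span(v, vectors, p):
    """True if v lies in the F_p-linear span of the given vectors."""
    return rank_mod_p(vectors + [v], p) == rank_mod_p(vectors, p)

def min_classes(i, forms, p):
    """Fewest classes partitioning the other forms with forms[i] in no class span."""
    others = [list(f) for j, f in enumerate(forms) if j != i]
    target = list(forms[i])
    if any(in_span(target, [o], p) for o in others):
        return None  # forms[i] is a multiple of another form: no partition works
    for s in range(1, len(others) + 1):
        for labels in itertools.product(range(s), repeat=len(others)):
            classes = [[o for o, lab in zip(others, labels) if lab == c]
                       for c in range(s)]
            if not any(in_span(target, cls, p) for cls in classes):
                return s
    return 1  # only reached when there are no other forms

def cs_complexity(forms, p):
    """Cauchy-Schwarz complexity of the system, or float('inf') if undefined."""
    worst = 1
    for i in range(len(forms)):
        s = min_classes(i, forms, p)
        if s is None:
            return float("inf")
        worst = max(worst, s)
    return worst - 1  # d + 1 classes suffice for every i

# x, x+y, x+z, x+y+z over F_2, and the 4-term progression x, x+y, x+2y, x+3y over F_5.
print(cs_complexity([(1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)], 2))
print(cs_complexity([(1, 0), (1, 1), (1, 2), (1, 3)], 5))
```

The first system is the one underlying the affinity identity and has complexity 1; the 4-term arithmetic progression has complexity 2, since any two of the remaining forms already span the whole space.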

If $\mathcal{L} = \{L_1, \dots, L_m\}$ contains two linear forms that are multiples of each other (that is, $L_i = \lambda L_j$
for $i \neq j$ and $\lambda \in \mathbb{F}$), then the complexity of $\mathcal{L}$ is infinity. Otherwise its complexity is at most
$|\mathcal{L}| - 2$. Note that sets of affine linear forms are always of finite complexity. The following lemma
can be proved using iterated applications of the classical Cauchy-Schwarz inequality. It explains
the term "Cauchy-Schwarz complexity", and illustrates its importance.

Lemma 1.12 (Counting Lemma, [GT10]). Let $f_1, \dots, f_m : \mathbb{F}^n \to [-1, 1]$. Let $\mathcal{L} = \{L_1, \dots, L_m\}$
be a system of $m$ linear forms in $\ell$ variables of complexity $d$. Then:

$$\left| \mathbb{E}_{x_1, \dots, x_\ell \in \mathbb{F}^n} \left[ \prod_{i=1}^{m} f_i(L_i(x_1, \dots, x_\ell)) \right] \right| \leq \min_{i \in [m]} \|f_i\|_{U^{d+1}}.$$

Finally, given a collection $\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), \dots, (A^i, \sigma^i), \dots\}$ of induced affine constraints,
we say that $\mathcal{A}$ is of complexity $\leq d$ if for each $i$, the collection of affine forms $A^i$ is of complexity
$\leq d$ according to Definition 1.11.
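
The Gowers norm appearing in Lemma 1.12 has not been defined at this point of the text. As a rough numerical illustration, the Python sketch below assumes the standard definition for real-valued functions, $\|f\|_{U^k}^{2^k} = \mathbb{E}_{x, h_1, \dots, h_k} \prod_{S \subseteq [k]} f(x + \sum_{i \in S} h_i)$, and evaluates it by exhaustive averaging on a tiny domain. The function name and the example functions are our own choices.

```python
import itertools

def gowers_norm(f, n, p, k):
    """U^k norm of a real-valued f on F_p^n, by exhaustive averaging."""
    points = list(itertools.product(range(p), repeat=n))
    subsets = list(itertools.product((0, 1), repeat=k))  # indicator vectors of S
    total = 0.0
    for x in points:
        for hs in itertools.product(points, repeat=k):
            term = 1.0
            for mask in subsets:
                # y = x + sum of the h_i with i in S, coordinate-wise mod p
                y = tuple((x[c] + sum(m * h[c] for m, h in zip(mask, hs))) % p
                          for c in range(n))
                term *= f(y)
            total += term
    avg = total / len(points) ** (k + 1)
    return max(avg, 0.0) ** (1.0 / 2 ** k)  # clamp tiny negative rounding errors

# Hypothetical examples on F_2^2 with values in {-1, 1}.
linear_phase = lambda x: (-1.0) ** x[0]              # (-1)^{x_1}
quadratic_phase = lambda x: (-1.0) ** (x[0] * x[1])  # (-1)^{x_1 x_2}

print(gowers_norm(linear_phase, 2, 2, 1), gowers_norm(linear_phase, 2, 2, 2))
print(gowers_norm(quadratic_phase, 2, 2, 2), gowers_norm(quadratic_phase, 2, 2, 3))
```

In this toy computation a linear phase has $U^1$ value 0 but $U^2$ norm 1, and a quadratic phase has $U^2$ norm $2^{-1/2}$ but $U^3$ norm 1, which matches the intuition that the $U^{d+1}$ norm detects correlation with polynomial phases of degree at most $d$.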



1.5.3 Statement of the main result 

Theorem 1.13 (Main theorem). For any integer $d > 0$ and (possibly infinite) fixed collection
$\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), \dots, (A^i, \sigma^i), \dots\}$ of induced affine constraints, each of complexity $\leq d$,
there are functions $q_{\mathcal{A}} : (0, 1) \to \mathbb{Z}^+$, $\delta_{\mathcal{A}} : (0, 1) \to (0, 1)$ and a tester $T$ which, for every $\varepsilon > 0$,
makes $q_{\mathcal{A}}(\varepsilon)$ queries, accepts $\mathcal{A}$-free functions and rejects functions $\varepsilon$-far from $\mathcal{A}$-free with probability
at least $\delta_{\mathcal{A}}(\varepsilon)$. Moreover, $q_{\mathcal{A}}$ is a constant if $\mathcal{A}$ is of finite size.

We do not have any explicit bounds on $\delta_{\mathcal{A}}$ because the analysis depends on previous work
based on ergodic theory. It would of course be interesting to have explicit bounds for some of the
properties described in Section 1.2.

Let us finally note that Theorem 1.2 is quite nontrivial even if A consists only of a single induced 
affine constraint of complexity greater than 1. Indeed, previously it was not known how to show 
testability in this case. A more detailed account of previous work is given in Section 1.7. 



1.6 Overview of Proofs 
1.6.1 Testability 

Let us now give an overview of our proof of Theorem 1.13. For simplicity of exposition, assume
for now that $\mathcal{A}$ consists only of a single induced affine constraint $(A, \sigma)$ where $A$ is the tuple of
affine linear forms $(L_1, \dots, L_m)$, each over $\ell$ variables, and $\sigma \in [R]^m$. Let $d$ be the complexity of
the constraint. For $i \in [R]$, let $f^{(i)} : \mathbb{F}^n \to \{0, 1\}$ be the indicator function for the set $f^{-1}(\{i\})$.
Our goal will be to show that, when $f$ is $\varepsilon$-far from $(A, \sigma)$-free, then:

$$\mathbb{E}_{x_1, \dots, x_\ell} \left[ f^{(\sigma_1)}(L_1(x_1, \dots, x_\ell)) \cdot f^{(\sigma_2)}(L_2(x_1, \dots, x_\ell)) \cdots f^{(\sigma_m)}(L_m(x_1, \dots, x_\ell)) \right] \geq \delta(\varepsilon), \qquad (1)$$

for some function $\delta : (0, 1) \to (0, 1)$. If Eq. (1) is true, then a valid test would be to simply pick $\ell$
points $x_1, \dots, x_\ell$ uniformly at random and reject only if $f(L_1(x_1, \dots, x_\ell)) = \sigma_1, \dots, f(L_m(x_1, \dots, x_\ell)) = \sigma_m$.
Studying averages of products, as in (1), has been crucial to a wide range of problems in additive
combinatorics and analytic number theory. Szemerédi's theorem about the density of arithmetic
progressions in subsets of the integers is a classic example. Szemerédi's work [Sze75] arguably
initiated such questions in additive combinatorics, but the major development which led to a more
systematic understanding of these averages was Gowers' definition of a new notion of uniformity
in a Fourier-analytic proof for Szemerédi's theorem [Gow01]. In particular, Gowers introduced the
Gowers norm $\|\cdot\|_{U^{d+1}}$, which allows us to say the following about (1): If $f_1, f_2, \dots, f_m$ are
arbitrary functions that are bounded inside $[-1, 1]$ with $\|f_1\|_{U^{d+1}} < \varepsilon$, and $L_1, \dots, L_m$ are linear
forms of complexity at most $d$, then

$$\left| \mathbb{E}_{x_1, \dots, x_\ell} \left[ \prod_{i=1}^{m} f_i(L_i(x_1, \dots, x_\ell)) \right] \right| \leq \varepsilon.$$



This observation leads to the study of decomposition theorems, which express an arbitrary function
$f$ as a sum of two functions $g$ and $h$, where $g$ is "structured" in a sense we describe soon and $h$
has low $(d+1)$-th order Gowers norm. Decomposing each $f^{(\sigma_i)}$ in this way into $g^{(\sigma_i)}$ and $h^{(\sigma_i)}$,
substituting into Eq. (1) and expanding, we get inside the expectation a sum of $2^m$ terms. All these
terms, except one, contain some $h^{(\sigma_i)}$ in the product and can be bounded by the above mentioned
property of the Gowers norm. In fact, we can make the Gowers norm small enough that we can
effectively discard all these terms inside the expectation. The term remaining is the product of the
"structured" functions,

$$\mathbb{E}_{x_1, \dots, x_\ell} \left[ g^{(\sigma_1)}(L_1(x_1, \dots, x_\ell)) \, g^{(\sigma_2)}(L_2(x_1, \dots, x_\ell)) \cdots g^{(\sigma_m)}(L_m(x_1, \dots, x_\ell)) \right], \qquad (2)$$

and the goal is to lower-bound this expectation.

To describe the structure of $g$, let us go over how the decomposition into $g$ and $h$ is obtained.
Given an arbitrary function $f$, if $\|f\|_{U^{d+1}}$ is small, then we are already done. Otherwise, we
repeatedly apply the Gowers inverse theorem to find a finite collection of polynomials $P_1, \dots, P_C$
of degree $\leq d$ such that $f = \Gamma(P_1, \dots, P_C) + h$, where $\|h\|_{U^{d+1}}$ is small and $\Gamma$ is some function.
But there is a catch in this nice-looking structural theorem! If $p > d$, then $P_1, \dots, P_C$ are indeed
"classical" degree-$d$ $\mathbb{F}$-valued polynomials over $\mathbb{F}^n$. However, in our setting, where $p$ is a fixed
small constant, such a decomposition may no longer hold. Indeed, [GT09, LMS08] proved that if $f$
equals the symmetric degree-4 polynomial and $d = 3$, we have an explicit counterexample to such a
claim. Fortunately, Bergelson, Tao and Ziegler [BTZ10, TZ10, TZ11] showed that it is possible to
salvage the decomposition theorem by replacing "classical" $\mathbb{F}$-valued polynomials by "non-classical"
polynomials. These polynomials may take values in $\mathbb{Z}_{p^k}$ for some integer $k$. More precisely, a
non-classical polynomial of degree $d$ is a function $P$ from $\mathbb{F}^n$ to $\mathbb{Z}_{p^k}$ such that the $(d+1)$-th order
derivative of $P$ is zero. The integer $k - 1$ is called the "depth" of $P$. Classical polynomials have
depth 0.
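
A small example may help to see how a non-classical polynomial differs from a classical one. The Python sketch below is our own illustration under the informal definition above: it models a function $P : \mathbb{F}_p^n \to \mathbb{Z}_{p^k}$ as a Python callable with integer values modulo `mod`, takes additive derivatives $D_h P(x) = P(x+h) - P(x)$, and finds the degree by brute force. The helper names are hypothetical. The example $P(x) = |x_1| + |x_2| \bmod 4$ on $\mathbb{F}_2^2$ has degree 2 and depth 1 even though it looks "linear", while $2P$ has degree 1.

```python
import itertools

def derivative(P, h, p, mod):
    """Additive derivative D_h P(x) = P(x + h) - P(x), with values in Z_mod."""
    def DP(x):
        shifted = tuple((a + b) % p for a, b in zip(x, h))
        return (P(shifted) - P(x)) % mod
    return DP

def degree(P, n, p, mod):
    """Smallest d such that every (d+1)-fold derivative of P vanishes identically."""
    points = list(itertools.product(range(p), repeat=n))
    d = 0
    while True:
        vanishes = True
        for hs in itertools.product(points, repeat=d + 1):
            Q = P
            for h in hs:
                Q = derivative(Q, h, p, mod)
            if any(Q(x) != 0 for x in points):
                vanishes = False
                break
        if vanishes:
            return d
        d += 1

# Non-classical example over F_2 with values in Z_4 (so k = 2, depth 1).
P = lambda x: (x[0] + x[1]) % 4          # coordinates read as integers 0 or 1
twoP = lambda x: (2 * (x[0] + x[1])) % 4
# Classical example over F_2 with values in Z_2 (depth 0).
classical = lambda x: (x[0] * x[1]) % 2

print(degree(P, 2, 2, 4), degree(twoP, 2, 2, 4), degree(classical, 2, 2, 2))
```

The drop in degree from $P$ to $2P$ is an instance of the phenomenon, specific to non-classical polynomials, that multiplying by a constant can change the degree.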

We use the result of [TZ11] to obtain non-classical polynomials $P_1, \dots, P_C$ of degree $\leq d$ such
that each $g^{(i)} = \Gamma_i(P_1, \dots, P_C)$ for some function $\Gamma_i$. We return now to the goal of lower-bounding
Eq. (2). By a sequence of steps already introduced in [BGS10] and [BFL12] (inspired by similar
techniques on graph property testing in [AFKS00, AS08b, AS08a]), we reduce to the problem of
lower-bounding the probability

$$\Pr_{x_1, \dots, x_\ell} \left[ \bigwedge_{i \in [C],\, j \in [m]} P_i(L_j(x_1, \dots, x_\ell)) = b_{i,j} \right]$$

where each $b_{i,j}$ is an arbitrary fixed element in the range of $P_i$. That is, we want to show that the
polynomials $\{P_i \circ L_j \mid i \in [C], j \in [m]\}$ behave like independent random variables distributed nearly
uniformly on their range. Of course, this cannot be completely true. For example, if $P_i$ is linear,
$P_i(x_1 + x_2 + x_3) - P_i(x_1 + x_2) - P_i(x_1 + x_3) + P_i(x_1)$ is identically zero and so, $\{P_i(x_1 + x_2 + x_3), P_i(x_1 +
x_2), P_i(x_1 + x_3), P_i(x_1)\}$ are correlated. Moreover, because the polynomials are non-classical, $pP$
is a non-constant polynomial of lower degree than $P$ and satisfies other identities not satisfied by
$P$ itself. What we show is that if the collection of polynomials $P_1, \dots, P_C$ is of high rank, then
besides correlations which are forced by the degree and depth of the polynomials, there are no other
dependencies. This equidistribution result for high rank non-classical polynomials is the technical
crux of our work. Our proof technique is very different from the similar equidistribution claim in
[HL11a, HL11b] for classical polynomials, since that proof uses the monomial structure of classical
polynomials.

Let us briefly describe what we mean by a high rank collection of non-classical polynomials 
$P_1, \dots, P_C$. We say that the rank of the collection is $r$ if $r$ is the least integer for which there exist
integers $\lambda_1, \dots, \lambda_C$ such that $\lambda_1 P_1, \dots, \lambda_C P_C$ are not all identically zero but
$\sum_{i=1}^{C} \lambda_i P_i = \Gamma(Q_1, \dots, Q_r)$ for some $r$ polynomials
$Q_1, \dots, Q_r$ each of degree $< \max_i \deg(\lambda_i P_i)$ and some function $\Gamma$. So, if the rank of a collection of
polynomials is high, that means that no linear combination of the polynomials, unless it is trivially 
zero, has an explanation in terms of a small number of lower degree polynomials. Intuitively, a 
high rank collection of degree d polynomials is like a random or generic collection of degree d 
polynomials. It does not have unexpected low-degree correlations, and it is robust to common 
operations such as taking projections or multiplying by constants or taking derivatives. 

This finishes the high-level overview of the proof, although there are some additional issues that 
we have swept under the rug. One problem is that the decomposition theorem actually decomposes 
a given function $f$ into a sum of three functions $f_1, f_2, f_3$, not into two functions $g$ and $h$ as in
the description above. The functions $f_1$ and $f_2$ correspond to $g$ and $h$, respectively, and $f_3$ is
an additional function that has low $L^2$-norm. Now, the distance from equidistribution of the
non-classical polynomials $P_1, \dots, P_C$ describing $f_1$ and the Gowers norm of $f_2$ can both be
made arbitrarily small as a function of $C$ and are thus essentially negligible for the purposes of the
proof. On the other hand, the bound on the $L^2$-norm for $f_3$ is only moderately small and cannot be
made to decrease as a function of the complexity of the decomposition. To get around this issue,
we use a sequence of two decompositions, and make the norm of $f_3$ decrease as a function of the
size of the first decomposition. We hope that these iterated decomposition theorems (proved in a 
prequel [BFL12] to this paper) are of independent interest. 

1.6.2 Degree-Structural Properties 

Next, we give an overview of our proof of Theorem 1.7. For the sake of concreteness, let us focus
on a particular degree-structural property, say, the property $\mathcal{P}$ of having a square root, as defined
in Section 1.4. To show that $\mathcal{P}$ is locally characterized, we find a constant $K = K(\mathcal{P})$ such that if a
function $F : \mathbb{F}^n \to \mathbb{F}$ does not have a square root, then there must exist a subspace $H$ of dimension
$K$ such that $F$ restricted to $H$ also does not have a square root.

So, suppose we are given a function $F : \mathbb{F}^n \to \mathbb{F}$ such that $n \geq K$ and every hyperplane
restriction has a square root of degree $\leq d/2$. For large enough constant $K$, this automatically
implies $\deg(F) \leq d$. We first regularize $F$, meaning that we find polynomials $P_1, \dots, P_C$ of degree
$\leq d$ such that $P_1, \dots, P_C$ are of high rank and $F = \Gamma(P_1, \dots, P_C)$ for some function $\Gamma$. Note that
here, just as in the proof of the testability result, we need to allow $P_1, \dots, P_C$ to be non-classical
polynomials. Now, for some $i$ such that $F|_{x_i = 0}$ has a square root, let $P'_1, \dots, P'_C$ be the restrictions
of $P_1, \dots, P_C$ to $x_i = 0$. So, $\Gamma(P'_1, \dots, P'_C) = G^2$ for some polynomial $G$. The polynomials
$P'_1, \dots, P'_C$ can be shown to be of high rank also. This implies that we can extend the collection of
polynomials $P'_1, \dots, P'_C$ to $P'_1, \dots, P'_C, Q_1, \dots, Q_D$ such that the new collection is also of high rank
and $G = \Lambda(P'_1, \dots, P'_C, Q_1, \dots, Q_D)$ for some function $\Lambda$. Hence

$$\Gamma(P'_1, \dots, P'_C) = \left( \Lambda(P'_1, \dots, P'_C, Q_1, \dots, Q_D) \right)^2.$$

Because of the high rank of the collection $\{P'_1, \dots, P'_C, Q_1, \dots, Q_D\}$, the equidistribution result
described in the last section allows us to conclude that in fact:

$$\Gamma(x_1, \dots, x_C) = \left( \Lambda(x_1, \dots, x_C, y_1, \dots, y_D) \right)^2$$

for all $x_1, \dots, x_C, y_1, \dots, y_D$ in the ranges of $P'_1, \dots, P'_C, Q_1, \dots, Q_D$, respectively. Therefore, if we
set $G' = \Lambda(P_1, \dots, P_C, 0, \dots, 0)$, then $F = G'^2$. It is immediate² that $\deg(G') \leq d/2$, and so, $F$ has
a square root.

It is curious that our proof of Theorem 1.7, which is entirely about classical polynomials, requires 
the use of non-classical polynomials. Also, as we mentioned earlier, there are no effective bounds on 
$K(\mathcal{P})$ that arise from our argument. It would be interesting to obtain better bounds (both upper
and lower) for the locality of degree-structural properties. 

1.7 Comparison with Previous Work 

This work is part of, and culminates, a sequence of works investigating the relationship between
affine-invariance and testability. As described, Kaufman and Sudan [KS08] initiated the program.
Subsequently, Bhattacharyya, Chen, Sudan, and Xie [BCSX11] investigated monotone linear-invariant
properties of functions $f : \mathbb{F}_2^n \to \{0, 1\}$, where a property $\mathcal{P}$ is monotone if it satisfies the
condition that for any function $g \in \mathcal{P}$, modifying $g$ by changing some outputs from 1 to 0 does
not make it violate $\mathcal{P}$. Král', Serra and Vena [KSV12] and, independently, Shapira [Sha09] showed
testability for any monotone linear-invariant property characterized by a finite number of linear
constraints (of arbitrary complexity). For general non-monotone properties, Bhattacharyya, Grigorescu,
and Shapira proved in [BGS10] that affine-invariant properties of functions in $\{\mathbb{F}_2^n \to \{0, 1\}\}$
are testable if the complexity of the property is 1. Earlier this year, Bhattacharyya, Fischer and
Lovett in [BFL12] generalized [BGS10] to show that affine-invariant properties of complexity $< p$
are testable. In this paper, we only have to restrict the complexity to be bounded, but the bound
can be independent of $p$.

In terms of techniques, the general framework of the proof for testability here is very much
the same as in [BGS10] or [BFL12]. However, the main difference here is that we work with
collections of non-classical polynomials, rather than classical ones. Because the degrees of non- 
classical polynomials can change when multiplied by constants, the notions of rank and regularity 
are much more subtle. We need to establish a new version of a "polynomial regularity lemma" 
which allows us to decompose a given polynomial collection into a high rank collection of non- 
classical polynomials. Also, as discussed earlier, we establish a new equidistribution theorem for 
non-classical polynomials. We expect that these results will be of independent interest. 

At a high level, the argument to prove our main theorem mirrors ideas used in a sequence
of works [AFKS00, AS08b, AS08a, FN07, AFNS06, BCL+06] to characterize the testable graph
properties. In particular, the technique of simultaneously decomposing the domain into a coarse
partition and a fine partition with very strong regularity properties is due to [AFKS00], and the
compactness argument used to handle infinitely many constraints is due to [AS08b].

^For other degree-structural properties, the degree bound may not be immediate at this last step, and we need to 
argue it separately, again using the equidistribution result for high-rank non-classical polynomials. 



1.8 Organization

In Section 2, we assemble all the technical components and establish some basic notions such as 
non-classical polynomials, rank and regularity. In Section 3, we show the equidistribution result for 
non-classical polynomials. In Section 4, we use the results established thus far to prove Theorem 1.7. 
Section 5 is devoted to proving Theorem 1.13. 

2 Preparation 

2.1 Notation 

For integers $a, b$, we let $[a]$ denote the set $\{1, 2, \ldots, a\}$ and $[a, b]$ denote the set
$\{a, a+1, \ldots, b\}$. For a set $S$, $\mathcal{P}(S)$ denotes the power set of $S$.

Fix a prime field $\mathbb{F} = \mathbb{F}_p$ for a prime $p \ge 2$. As we defined earlier, $|\cdot|$
denotes the standard map from $\mathbb{F}$ to $\{0, 1, \ldots, p-1\} \subset \mathbb{Z}$.

We use the shorthand $x = a \pm \varepsilon$ to mean $a - \varepsilon \le x \le a + \varepsilon$.

2.2 Locality 

In the context of affine-invariant properties, we can define the notion of local characterization in a
more algebraic way than we did in the introduction. Recall that a hyperplane is an affine subspace
of codimension 1.

Definition 2.1 (Locally characterized properties). An affine-invariant property
$\mathcal{P} \subseteq \{\mathbb{F}^n \to [R] : n \ge 0\}$ is said to be locally characterized if both
of the following hold:

• For every function $f : \mathbb{F}^n \to [R]$ in $\mathcal{P}$ and every hyperplane
$A \le \mathbb{F}^n$, $f|_A \in \mathcal{P}$.

• There exists a constant $K \ge 1$ such that if $f : \mathbb{F}^n \to [R]$ does not belong to
$\mathcal{P}$ and $n > K$, then there exists a hyperplane $B$ such that $f|_B \notin \mathcal{P}$.

The constant $K$ is said to be the locality of $\mathcal{P}$.

The following observation shows that an affine-invariant property is locally characterized if and 
only if it can be described using a bounded number of induced affine constraints from the previous 
section, and hence, is locally characterized in the sense of the introduction. 

Lemma 2.2. If $\mathcal{P} \subseteq \{\mathbb{F}^n \to [R] : n \ge 0\}$ is a locally characterized
affine-invariant property with locality $K$, then $\mathcal{P}$ is equivalent to
$\mathcal{A}$-freeness, where $\mathcal{A}$ is a finite collection of induced affine constraints, with
each constraint of size $p^K$ on $K+1$ variables. On the other hand, if $\mathcal{P}$ is equivalent
to $\mathcal{A}$-freeness, where $\mathcal{A}$ is a collection of induced affine constraints with each
constraint on $\le K+1$ variables, then $\mathcal{P}$ has locality at most $K$.

Finally, we also make formal note of the observation in the introduction that if a property is 
testable, then it must be locally characterized. 

Remark 2.3. If $K$ is a fixed integer and $\mathcal{P} \subseteq \{\mathbb{F}^n \to [R]\}$ is an
affine-invariant property that is testable with $K$ queries, then $\mathcal{P}$ is a locally
characterized property with locality $K$.

So, we can view our main result as a converse statement. 
2.3 Derivatives and Polynomials

Definition 2.4 (Multiplicative Derivative). Given a function $f : \mathbb{F}^n \to \mathbb{C}$ and an
element $h \in \mathbb{F}^n$, define the multiplicative derivative in direction $h$ of $f$ to be the
function $\Delta_h f : \mathbb{F}^n \to \mathbb{C}$ satisfying
$\Delta_h f(x) = f(x+h)\overline{f(x)}$ for all $x \in \mathbb{F}^n$.

The Gowers norm of order d for a function / is the expected multiplicative derivative of / in d 
random directions at a random point. 

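Definition 2.5 (Gowers norm). Given a function $f : \mathbb{F}^n \to \mathbb{C}$ and an integer
$d \ge 1$, the Gowers norm of order $d$ for $f$ is given by

\[ \|f\|_{U^d} = \left| \mathbb{E}_{h_1, \ldots, h_d, x \in \mathbb{F}^n}
   \left[ (\Delta_{h_1} \Delta_{h_2} \cdots \Delta_{h_d} f)(x) \right] \right|^{1/2^d}. \]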



Note that as $\|f\|_{U^1} = |\mathbb{E}[f]|$, the Gowers norm of order 1 is only a semi-norm. However
for $d > 1$, it is not difficult to show that $\|\cdot\|_{U^d}$ is indeed a norm.
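
To make Definition 2.5 concrete, the following is a minimal brute-force sketch, not taken from the paper, that evaluates the Gowers norm on a tiny domain. The function names `gowers_norm` and `quadratic_phase` are illustrative choices, and the enumeration is exponential in $n$ and $d$, so it is only meant for sanity checks.

```python
import cmath
import itertools

def gowers_norm(f, p, n, d):
    """Brute-force ||f||_{U^d} for f : F_p^n -> C (only feasible for tiny p, n, d)."""
    points = list(itertools.product(range(p), repeat=n))

    def shift(x, h):
        return tuple((a + b) % p for a, b in zip(x, h))

    total = 0
    for x in points:
        for hs in itertools.product(points, repeat=d):
            term = 1
            # Expanding Delta_{h_1}...Delta_{h_d} f(x) gives one factor per subset S of [d]:
            # f(x + sum_{i in S} h_i), conjugated when d - |S| is odd.
            for mask in range(2 ** d):
                y = x
                for i in range(d):
                    if (mask >> i) & 1:
                        y = shift(y, hs[i])
                value = complex(f(y))
                if (d - bin(mask).count("1")) % 2 == 1:
                    value = value.conjugate()
                term *= value
            total += term
    average = total / (p ** (n * (d + 1)))
    return abs(average) ** (1 / 2 ** d)

def quadratic_phase(x):
    # f = e(P) with P(x) = x_1 x_2 / 2 on F_2^2, a classical polynomial of degree 2
    return cmath.exp(2j * cmath.pi * (x[0] * x[1]) / 2)
```

For this quadratic phase on $\mathbb{F}_2^2$, the norms of order 1, 2 and 3 come out as $1/2$, $2^{-1/2}$ and $1$ respectively, which also illustrates the monotonicity of the Gowers norm in $d$.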

If $f = e^{2\pi i P/p}$ where $P : \mathbb{F}^n \to \mathbb{F}$ is a polynomial of degree $< d$, then
$\|f\|_{U^d} = 1$. If $d < p$ and $\|f\|_\infty \le 1$, then in fact, the converse holds, meaning that
any function $f : \mathbb{F}^n \to \mathbb{C}$ satisfying $\|f\|_\infty \le 1$ and $\|f\|_{U^d} = 1$
is of this form. But when $d \ge p$, the converse is no longer true. In order to characterize
functions $f : \mathbb{F}^n \to \mathbb{C}$ with $\|f\|_\infty \le 1$ and $\|f\|_{U^d} = 1$, we define
the notion of non-classical polynomials.

Non-classical polynomials might not be necessarily $\mathbb{F}$-valued. We need to introduce some
notation. Let $\mathbb{T}$ denote the circle group $\mathbb{R}/\mathbb{Z}$. This is an abelian group
with group operation denoted $+$. For an integer $k \ge 0$, let $\mathbb{U}_k$ denote
$\frac{1}{p^k}\mathbb{Z}/\mathbb{Z}$, a subgroup of $\mathbb{T}$. Let
$\iota : \mathbb{F} \to \mathbb{U}_1$ be the injection $x \mapsto \frac{|x|}{p} \bmod 1$, where $|x|$
is the standard map from $\mathbb{F}$ to $\{0, 1, \ldots, p-1\}$. Let
$e : \mathbb{T} \to \mathbb{C}$ denote the character $e(x) = e^{2\pi i x}$.

Definition 2.6 (Additive Derivative). Given a function^ $P : \mathbb{F}^n \to \mathbb{T}$ and an
element $h \in \mathbb{F}^n$, define the additive derivative in direction $h$ of $P$ to be the function
$D_h P : \mathbb{F}^n \to \mathbb{T}$ satisfying $D_h P(x) = P(x+h) - P(x)$ for all
$x \in \mathbb{F}^n$.

Definition 2.7 (Non-classical polynomials). For an integer $d \ge 0$, a function
$P : \mathbb{F}^n \to \mathbb{T}$ is said to be a non-classical polynomial of degree $\le d$ (or simply
a polynomial of degree $\le d$) if for all $h_1, \ldots, h_{d+1}, x \in \mathbb{F}^n$, it holds that

\[ (D_{h_1} \cdots D_{h_{d+1}} P)(x) = 0. \tag{3} \]

The degree of $P$ is the smallest $d$ for which the above holds. A function
$P : \mathbb{F}^n \to \mathbb{T}$ is said to be a classical polynomial of degree $\le d$ if it is a
non-classical polynomial of degree $\le d$ whose image is contained in $\iota(\mathbb{F})$.

It is a direct consequence that a function $f : \mathbb{F}^n \to \mathbb{C}$ with
$\|f\|_\infty \le 1$ satisfies $\|f\|_{U^{d+1}} = 1$ if and only if $f = e(P)$ for a (non-classical)
polynomial $P : \mathbb{F}^n \to \mathbb{T}$ of degree $\le d$.

The following lemma of Tao and Ziegler shows that a classical polynomial $P$ of degree $d$ must
always be of the form $\iota \circ Q$, where $Q : \mathbb{F}^n \to \mathbb{F}$ is a polynomial (in the
usual sense) of degree $d$. It also characterizes the structure of non-classical polynomials.



^We try to adhere to the following convention: upper-case letters (e.g. $F$ and $P$) to denote
functions mapping from $\mathbb{F}^n$ to $\mathbb{T}$ or to $\mathbb{F}$, lower-case letters (e.g. $f$
and $g$) to denote functions mapping from $\mathbb{F}^n$ to $\mathbb{C}$, and upper-case Greek letters
(e.g. $\Gamma$ and $\Sigma$) to denote functions mapping $\mathbb{T}^C$ to $\mathbb{T}$. By abuse of
notation, we sometimes conflate $\mathbb{F}$ and $\iota(\mathbb{F})$.






Lemma 2.8 (Part of Lemma 1.7 in [TZ11]). Let $d \ge 1$ be an integer.

(i) A function $P : \mathbb{F}^n \to \mathbb{T}$ is a polynomial of degree $\le d+1$ if and only if
$D_h P$ is a polynomial of degree $\le d$ for all $h \in \mathbb{F}^n$.

(ii) A function $P : \mathbb{F}^n \to \mathbb{T}$ is a classical polynomial of degree $\le d$ if
$P = \iota \circ Q$, where $Q : \mathbb{F}^n \to \mathbb{F}$ has a representation of the form

\[ Q(x_1, \ldots, x_n) = \sum_{0 \le d_1, \ldots, d_n < p:\ \sum_i d_i \le d}
   c_{d_1, \ldots, d_n} x_1^{d_1} \cdots x_n^{d_n} \]

for a unique choice of coefficients $c_{d_1, \ldots, d_n} \in \mathbb{F}$.

(iii) A function $P : \mathbb{F}^n \to \mathbb{T}$ is a polynomial of degree $\le d$ if and only if
$P$ can be represented as

\[ P(x_1, \ldots, x_n) = \alpha + \sum_{\substack{0 \le d_1, \ldots, d_n < p;\ k \ge 0: \\ 0 < \sum_i d_i \le d - k(p-1)}}
   \frac{c_{d_1, \ldots, d_n, k} |x_1|^{d_1} \cdots |x_n|^{d_n}}{p^{k+1}} \mod 1, \]

for a unique choice of $c_{d_1, \ldots, d_n, k} \in \{0, 1, \ldots, p-1\}$ and $\alpha \in \mathbb{T}$.
The element $\alpha$ is called the shift of $P$, and the largest integer $k$ such that there exist
$d_1, \ldots, d_n$ for which $c_{d_1, \ldots, d_n, k} \ne 0$ is called the depth of $P$. Classical
polynomials correspond to polynomials with 0 shift and 0 depth.

(iv) If $P : \mathbb{F}^n \to \mathbb{T}$ is a polynomial of depth $k$, then it takes values in a
coset of the subgroup $\mathbb{U}_{k+1}$. In particular, a polynomial of degree $\le d$ takes on at
most $p^{\lfloor (d-1)/(p-1) \rfloor + 1}$ distinct values.

Note that Lemma 2.8 (iii) immediately implies the following important observation^:

Remark 2.9. If $Q : \mathbb{F}^n \to \mathbb{T}$ is a polynomial of degree $d$ and depth $k$, then
$pQ$ is a polynomial of degree $\max(d - p + 1, 0)$ and depth $k - 1$. In other words, if $Q$ is
classical, then $pQ$ vanishes, and otherwise, its degree decreases by $p - 1$ and its depth by 1.
Also, if $\lambda \in [1, p-1]$ is an integer, then $\deg(\lambda Q) = d$ and
$\mathrm{depth}(\lambda Q) = k$.
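
The degree shift in Remark 2.9 can be checked mechanically on small examples. The sketch below is an illustration rather than anything from the paper: it stores a function $\mathbb{F}_p^n \to \mathbb{T}$ as a table of exact fractions and computes its degree straight from Definition 2.7, using the recursion of Lemma 2.8 (i). The helper names are hypothetical, and the recursion is only guaranteed to terminate when all values lie in some $\mathbb{U}_k$ (denominators that are powers of $p$).

```python
from fractions import Fraction
from itertools import product

def additive_derivative(table, h, p):
    """D_h P as a value table; arithmetic is in T = R/Z, i.e. modulo 1."""
    return {
        x: (table[tuple((a + b) % p for a, b in zip(x, h))] - value) % 1
        for x, value in table.items()
    }

def degree(table, p):
    """Smallest d such that all (d+1)-fold additive derivatives vanish (Definition 2.7)."""
    n = len(next(iter(table)))
    derivatives = [additive_derivative(table, h, p) for h in product(range(p), repeat=n)]
    if all(value == 0 for deriv in derivatives for value in deriv.values()):
        return 0  # every first derivative vanishes, so P is constant
    # Lemma 2.8 (i): deg P <= d + 1 iff deg D_h P <= d for every direction h.
    return 1 + max(degree(deriv, p) for deriv in derivatives)

def scale(table, c):
    """The function cP, again reduced modulo 1."""
    return {x: (c * value) % 1 for x, value in table.items()}

# Q4(x) = |x| / 4 on F_2 (depth 1) and Q9(x) = |x| / 9 on F_3 (depth 1), both with zero shift.
Q4 = {(x,): Fraction(x, 4) for x in range(2)}
Q9 = {(x,): Fraction(x, 9) for x in range(3)}
```

Here `degree(Q4, 2)` is 2 while `degree(scale(Q4, 2), 2)` is 1, and `degree(Q9, 3)` is 3 while `degree(scale(Q9, 3), 3)` is 1; multiplying `Q9` by 2 leaves the degree at 3, in line with the remark.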

Also, for convenience of exposition, we will assume throughout this paper that the shifts of all
polynomials are zero. This can be done without affecting any of the results in this work. Hence,
all polynomials of depth $k$ take values in $\mathbb{U}_{k+1}$.



2.4 Inverse Theorem 

There is a tight connection between polynomials and Gowers norms. In one direction, it is a
straightforward consequence of the monotonicity of the Gowers norm
($\|f\|_{U^d} \le \|f\|_{U^{d+1}}$) and invariance of the Gowers norm with respect to modulation by
lower degree polynomials ($\|f\|_{U^{d+1}} = \|f \cdot e(P)\|_{U^{d+1}}$ for polynomials $P$ of degree
$\le d$) that if $f$ is $\delta$-correlated with a polynomial $P$ of degree $\le d$, meaning

\[ \left| \mathbb{E}_x f(x) e(-P(x)) \right| \ge \delta \]

for some $\delta > 0$, then $\|f\|_{U^{d+1}} \ge \delta$.

^Recall that $\mathbb{T}$ is an additive group. If $n \in \mathbb{Z}$ and $x \in \mathbb{T}$, then
$nx$ is shorthand for $x + \cdots + x$ ($n$ terms) if $n \ge 0$ and $-x - \cdots - x$ ($-n$ terms)
otherwise.

In the other direction, we have the following "Inverse theorem for the Gowers norm". 

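Theorem 2.10 (Theorem 1.11 of [TZ11]). Suppose $\delta > 0$ and $d \ge 1$ is an integer. There exists
an $\varepsilon = \varepsilon_{2.10}(\delta, d)$ such that the following holds. For every function
$f : \mathbb{F}^n \to \mathbb{C}$ with $\|f\|_\infty \le 1$ and $\|f\|_{U^{d+1}} \ge \delta$, there
exists a polynomial $P : \mathbb{F}^n \to \mathbb{T}$ of degree $\le d$ that is
$\varepsilon$-correlated with $f$, meaning

\[ \left| \mathbb{E}_{x \in \mathbb{F}^n} f(x) e(-P(x)) \right| \ge \varepsilon. \]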



2.5 Rank 



We will often need to study Gowers norms of exponentials of polynomials. As we describe below if 
this analytic quantity is non-negligible, then there is an algebraic explanation for this: it is possible 
to decompose the polynomial as a function of a constant number of low-degree polynomials. To 
state this rigorously, let us define the notion of rank of a polynomial. 

Definition 2.11 (Rank of a polynomial). Given a polynomial $P : \mathbb{F}^n \to \mathbb{T}$ and an
integer $d > 1$, the $d$-rank of $P$, denoted $\mathrm{rank}_d(P)$, is defined to be the smallest
integer $r$ such that there exist polynomials $Q_1, \ldots, Q_r : \mathbb{F}^n \to \mathbb{T}$ of
degree $\le d - 1$ and a function $\Gamma : \mathbb{T}^r \to \mathbb{T}$ satisfying
$P(x) = \Gamma(Q_1(x), \ldots, Q_r(x))$. If $d = 1$, then 1-rank is defined to be $\infty$ if $P$ is
non-constant and 0 otherwise. The rank of a polynomial $P : \mathbb{F}^n \to \mathbb{T}$ is its
$\deg(P)$-rank.

Note that for integer $\lambda \in [1, p-1]$, $\mathrm{rank}(P) = \mathrm{rank}(\lambda P)$. The
following theorem of Tao and Ziegler shows that high rank polynomials have small Gowers norms.

Theorem 2.12 (Theorem 1.20 of [TZ11]). For any $\varepsilon > 0$ and integer $d > 0$, there exists an
integer $r = r_{2.12}(d, \varepsilon)$ such that the following is true. For any polynomial
$P : \mathbb{F}^n \to \mathbb{T}$ of degree $\le d$, if $\|e(P)\|_{U^d} \ge \varepsilon$, then
$\mathrm{rank}_d(P) \le r$.

For future use, we also record here a simple lemma stating that restrictions of high rank poly- 
nomials to hyperplanes generally preserve degree and high rank. 

Lemma 2.13. Suppose $P : \mathbb{F}^n \to \mathbb{T}$ is a polynomial of degree $d$ and rank
$\ge r$, where $r > p + 1$. Let $A$ be a hyperplane in $\mathbb{F}^n$, and denote by $P'$ the
restriction of $P$ to $A$. Then, $P'$ is a polynomial of degree $d$ and rank $\ge r - p$, unless
$d = 1$ and $P$ is constant on $A$.

Proof. For the case $d = 1$, we can check directly that either $P'$ is constant or else, $P'$ is a
non-constant degree-1 polynomial and so has rank infinity.

So, assume $d > 1$. By making an affine transformation, we can assume without loss of generality
that $A$ is the hyperplane $\{x_1 = 0\}$. Let $\pi : \mathbb{F}^n \to \mathbb{F}^{n-1}$ be the
projection to $A$. Let $P'' = P - P' \circ \pi$. Clearly, $P''$ is zero on $A$. For
$x \in \mathbb{F} \setminus \{0\}$, let $h_x = (x, 0, \ldots, 0) \in \mathbb{F}^n$. Note that
$D_{h_x} P''$ is of degree $\le d - 1$ and that $(D_{h_x} P'')(y) = P''(y + h_x)$ for all $y \in A$.
Hence, for every $x \in \mathbb{F} \setminus \{0\}$, $P''$ on $h_x + A$ agrees with a polynomial
$Q_x$ of degree $\le d - 1$. So, for a function $\Gamma : \mathbb{T}^{p+1} \to \mathbb{T}$, we can
write $P = \Gamma(\iota(x_1), P', Q_1, Q_2, \ldots, Q_{p-1})$, where
$\iota(x_1), Q_1, \ldots, Q_{p-1}$ are of degree $\le d - 1$.

Now, if $P'$ itself is of degree $d - 1$, then $P$ is of rank $\le p + 1 < r$, a contradiction. If
$P'$ is of rank $< r - p$, then again $P$ is of rank $< r - p + p = r$, a contradiction. □
2.6 Polynomial factors



A high-rank polynomial of degree d is, intuitively, a "generic" degree-d polynomial. There are no 
unexpected ways to decompose it into lower degree polynomials, and the property of high rank is 
robust against various operations such as restrictions to hyperplanes, taking derivatives, multiplying 
by integers, etc. Next, we will formalize the notion of a generic collection of polynomials. Intuitively, 
it should mean that there are no unexpected algebraic dependencies among the polynomials. First, 
we need to set up some notation. 

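Definition 2.14 (Factors). If $X$ is a finite set then by a factor $\mathcal{B}$ we mean simply a
partition of $X$ into finitely many pieces called atoms.

A function $f : X \to \mathbb{C}$ is called $\mathcal{B}$-measurable if it is constant on atoms of
$\mathcal{B}$. For any function $f : X \to \mathbb{C}$, we may define the conditional expectation

\[ \mathbb{E}[f \mid \mathcal{B}](x) = \mathbb{E}_{y \in \mathcal{B}(x)}[f(y)], \]

where $\mathcal{B}(x)$ is the unique atom in $\mathcal{B}$ that contains $x$. Note that
$\mathbb{E}[f \mid \mathcal{B}]$ is $\mathcal{B}$-measurable.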

A finite collection of functions $\phi_1, \ldots, \phi_C$ from $X$ to some other space $Y$ naturally
define a factor $\mathcal{B} = \mathcal{B}_{\phi_1, \ldots, \phi_C}$ whose atoms are sets of the form
$\{x : (\phi_1(x), \ldots, \phi_C(x)) = (y_1, \ldots, y_C)\}$ for some
$(y_1, \ldots, y_C) \in Y^C$. By an abuse of notation we also use $\mathcal{B}$ to denote the map
$x \mapsto (\phi_1(x), \ldots, \phi_C(x))$, thus also identifying the atom containing $x$ with
$(\phi_1(x), \ldots, \phi_C(x))$.
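
As a minimal sketch of these notions (an illustration, not part of the paper), the factor generated by a list of functions can be built by grouping points by their tuple of values, and the conditional expectation is then the average of $f$ over each atom. The name `conditional_expectation` and its return convention are assumptions made here.

```python
from collections import defaultdict
from itertools import product

def conditional_expectation(points, f, phis):
    """Return (E[f | B], atoms) for the factor B generated by the functions in phis."""
    atoms = defaultdict(list)
    for x in points:
        # The atom containing x is identified with the tuple (phi_1(x), ..., phi_C(x)).
        atoms[tuple(phi(x) for phi in phis)].append(x)
    averages = {
        label: sum(f(y) for y in members) / len(members)
        for label, members in atoms.items()
    }

    def cond(x):
        return averages[tuple(phi(x) for phi in phis)]

    return cond, dict(atoms)

# Example on X = F_2^2 with the factor generated by the single function x -> x_1.
points = list(product(range(2), repeat=2))
cond, atoms = conditional_expectation(points, lambda x: x[0] + 2 * x[1], [lambda x: x[0]])
```

In the example, the factor has two atoms, and `cond` takes the value 1 on the atom $\{x_1 = 0\}$ and 2 on the atom $\{x_1 = 1\}$; by construction it is constant on atoms, i.e. measurable with respect to the factor.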

Definition 2.15 (Polynomial factors). If $P_1, \ldots, P_C : \mathbb{F}^n \to \mathbb{T}$ is a
sequence of polynomials, then the factor $\mathcal{B}_{P_1, \ldots, P_C}$ is called a polynomial
factor.

The complexity of $\mathcal{B}$, denoted $|\mathcal{B}|$, is the number of defining polynomials $C$.
The degree of $\mathcal{B}$ is the maximum degree among its defining polynomials
$P_1, \ldots, P_C$. If $P_1, \ldots, P_C$ are of depths $k_1, \ldots, k_C$, respectively, then
$\|\mathcal{B}\| = \prod_{i=1}^{C} p^{k_i+1}$ is called the order of $\mathcal{B}$.

Notice that the number of atoms of $\mathcal{B}$ is bounded by $\|\mathcal{B}\|$. The rank of a
factor can now be defined as follows.

Definition 2.16 (Rank and Regularity). A polynomial factor $\mathcal{B}$ defined by a sequence of
polynomials $P_1, \ldots, P_C : \mathbb{F}^n \to \mathbb{T}$ with respective depths
$k_1, \ldots, k_C$ is said to have rank $r$ if $r$ is the least integer for which there exist
$(\lambda_1, \ldots, \lambda_C) \in \mathbb{Z}^C$ so that
$(\lambda_1 \bmod p^{k_1+1}, \ldots, \lambda_C \bmod p^{k_C+1}) \ne (0, \ldots, 0)$ and the
polynomial $Q = \sum_{i=1}^{C} \lambda_i P_i$ satisfies $\mathrm{rank}_d(Q) \le r$ where
$d = \max_i \deg(\lambda_i P_i)$.

Given a polynomial factor $\mathcal{B}$ and a function $r : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$, we
say $\mathcal{B}$ is $r$-regular if $\mathcal{B}$ is of rank larger than $r(|\mathcal{B}|)$.

Note that since $\lambda$ can be a multiple of $p$, rank measured with respect to
$\deg(\lambda P)$ is not the same as rank measured with respect to $\deg(P)$. So, for instance, if
$\mathcal{B}$ is the factor defined by a single polynomial $P$ of degree $d$ and depth $k$, then

\[ \mathrm{rank}(\mathcal{B}) = \min\left\{ \mathrm{rank}_d(P),\ \mathrm{rank}_{d-(p-1)}(pP),\ \ldots,\
   \mathrm{rank}_{d-k(p-1)}(p^k P) \right\}. \]

Regular factors indeed do behave like a generic collection of polynomials, as we shall establish
in a precise sense in Section 3. Thus, given any factor $\mathcal{B}$ that is not regular, it will
often be useful to regularize $\mathcal{B}$, that is, find a refinement $\mathcal{B}'$ of
$\mathcal{B}$ that is regular up to our desires. We distinguish between two kinds of refinements:
Definition 2.17 (Semantic and syntactic refinements). $\mathcal{B}'$ is called a syntactic refinement
of $\mathcal{B}$, and denoted $\mathcal{B}' \succeq_{syn} \mathcal{B}$, if the sequence of polynomials
defining $\mathcal{B}'$ extends that of $\mathcal{B}$. It is called a semantic refinement, and denoted
$\mathcal{B}' \succeq_{sem} \mathcal{B}$, if the induced partition is a combinatorial refinement of
the partition induced by $\mathcal{B}$. In other words, if for every $x, y \in \mathbb{F}^n$,
$\mathcal{B}'(x) = \mathcal{B}'(y)$ implies $\mathcal{B}(x) = \mathcal{B}(y)$.
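
The semantic condition is a statement about partitions only, so it can be tested by brute force on a small domain. The following is a minimal sketch under that reading; the function name `is_semantic_refinement` and the example factors are illustrative assumptions, not objects from the paper.

```python
from itertools import product

def is_semantic_refinement(points, fine, coarse):
    """True if fine(x) == fine(y) always implies coarse(x) == coarse(y)."""
    coarse_label_of = {}
    for x in points:
        label = fine(x)
        # Two points in the same atom of `fine` must land in the same atom of `coarse`.
        if coarse_label_of.setdefault(label, coarse(x)) != coarse(x):
            return False
    return True

points = list(product(range(2), repeat=2))
coarse = lambda x: (x[0],)                          # factor defined by x_1
fine_syn = lambda x: (x[0], x[1])                   # extends the defining sequence: syntactic
fine_sem = lambda x: ((x[0] + x[1]) % 2, x[1])      # same partition, different defining sequence
```

Both `fine_syn` and `fine_sem` semantically refine `coarse`, but only `fine_syn` lists $x_1$ among its defining functions; `coarse` does not refine either of them.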

Remark 2.18. Clearly, being a syntactic refinement is stronger than being a semantic refinement.
But observe that if $\mathcal{B}'$ is a semantic refinement of $\mathcal{B}$, then there exists a
syntactic refinement $\mathcal{B}''$ of $\mathcal{B}$ that induces the same partition of
$\mathbb{F}^n$, and for which $|\mathcal{B}''| \le |\mathcal{B}'| + |\mathcal{B}|$, because we can
define $\mathcal{B}''$ by just adding the defining polynomials of $\mathcal{B}$ to those of
$\mathcal{B}'$.

The following lemma is the workhorse that allows us to construct regular refinements.

Lemma 2.19 (Polynomial Regularity Lemma). Let $r : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ be a
non-decreasing function and $d > 0$ be an integer. Then, there is a function
$C^{(r,d)}_{2.19} : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ such that the following is true. Suppose
$\mathcal{B}$ is a factor defined by polynomials $P_1, \ldots, P_C : \mathbb{F}^n \to \mathbb{T}$ of
degree at most $d$. Then, there is an $r$-regular factor $\mathcal{B}'$ consisting of polynomials
$Q_1, \ldots, Q_{C'} : \mathbb{F}^n \to \mathbb{T}$ of degree $\le d$ such that
$\mathcal{B}' \succeq_{sem} \mathcal{B}$ and $C' \le C^{(r,d)}_{2.19}(C)$.

Moreover, if $\mathcal{B}$ is itself a refinement of some $\hat{\mathcal{B}}$ that has rank
$> (r(C') + C')$ and consists of polynomials, then additionally $\mathcal{B}'$ will be a syntactic
refinement of $\hat{\mathcal{B}}$.

Proof. We can prove our lemma starting from Lemma 9.6 of [TZ11]. To explain, let us define the
notion of an extended factor. We say a polynomial factor $\mathcal{B}$ is extended if for any
polynomial $Q \in \mathcal{B}$ that is not classical, $pQ \in \mathcal{B}$ also. Note that an extended
factor defined by polynomials $P_1, \ldots, P_C$ is of high rank if for all tuples
$(\lambda_1, \ldots, \lambda_C) \in [0, p-1]^C$, unless all the $\lambda_i$'s are zero,
$\sum_i \lambda_i P_i$ is of high $(\max_i \deg(\lambda_i P_i))$-rank. Tao and Ziegler proved the
following:

Lemma 2.20 (Lemma 9.6 of [TZ11]). Let $r : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ be a non-decreasing
function and $d > 0$ be an integer. Then, there are functions
$C^{(r,d)} : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ and
$I^{(r,d)} : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ such that the following is true. Suppose
$\mathcal{B}$ is an extended polynomial factor defined by polynomials
$P_1, \ldots, P_C : \mathbb{F}^n \to \mathbb{T}$ of degree $\le d$. Then, there is a subspace
$V \le \mathbb{F}^n$ and an $r$-regular extended factor $\tilde{\mathcal{B}}$ consisting of
polynomials $Q_1, \ldots, Q_{\tilde{C}} : V \to \mathbb{T}$ such that $2 \le \deg(Q_i) \le d$ for each
$i$, $\tilde{\mathcal{B}}$ semantically refines the factor defined by
$P_1|_V, \ldots, P_C|_V$, $\tilde{C} \le C^{(r,d)}(C)$, and $\dim(V) \ge n - I^{(r,d)}(C)$.

Let $\mathcal{B}_1$ be the extended factor defined by
$\{p^k P_i \mid 0 \le k \le \mathrm{depth}(P_i), i \in [C]\}$. Apply Lemma 2.20 to $\mathcal{B}_1$ in
order to obtain a bounded index subspace $V_1$ and an extended $R_1$-regular factor
$\tilde{\mathcal{B}}_1$ defined by polynomials $Q_1, \ldots, Q_{\tilde{C}} : V_1 \to \mathbb{T}$,
where $R_1$ is a growth function (growing even faster than $r$) we specify later on in the proof and
$\tilde{C} \le C^{(R_1,d)}(|\mathcal{B}_1|)$. For $a \in \mathbb{F}^n/V_1$ and
$P \in \mathcal{B}_1$, define $P^a : V_1 \to \mathbb{T}$ to be
$P^a(x) = D_a P(x) = P(a + x) - P(x)$. Each $P^a$ is of degree $\le d - 1$. Also, since $V_1$ is the
intersection of $I \le I^{(r,d)}(C)$ hyperplanes, we can decide which coset in $\mathbb{F}^n/V_1$ an
element $x \in \mathbb{F}^n$ belongs to as a function of $I \le I^{(r,d)}(C)$ (classical) linear
functions $\pi_1, \ldots, \pi_I$. Let $\mathcal{B}_2$ be the extended factor obtained by adding to
$\tilde{\mathcal{B}}_1$ all the polynomials
$\{P^a \mid P \in \mathcal{B}_1, a \in \mathbb{F}^n/V_1\}$ and $\pi_1, \ldots, \pi_I$. Consider
$x \in \mathbb{F}^n$ and let $x = a + y$ where $y \in V_1$ and $a \in \mathbb{F}^n/V_1$. Since
$P(x) = P(y) - P^{-a}(x)$, each polynomial in $\mathcal{B}_1$ is a function of the polynomials in
$\mathcal{B}_2$ over all of $\mathbb{F}^n$, and so $\mathcal{B}_2$ is a semantic refinement of
$\mathcal{B}_1$ (and a syntactic refinement of $\tilde{\mathcal{B}}_1$). Note that
$|\mathcal{B}_2| \le \tilde{C} + d\tilde{C} I^{(r,d)}(\tilde{C}) + I^{(r,d)}(\tilde{C})
\le \tilde{C} + 2d\tilde{C} I^{(r,d)}(\tilde{C})$.

Now, suppose we repeat the steps in the previous paragraph with $\mathcal{B}_2$ taking the place of
$\mathcal{B}_1$ and a different function $R_2$ taking the place of $R_1$. We specify $R_2$ later, but
we will choose it so that it grows faster than $r$. The new application of Lemma 2.20 to
$\mathcal{B}_2$ produces an extended factor $\tilde{\mathcal{B}}_2$ that is $R_2$-regular and a
bounded index subspace $V_2$ such that the polynomials in $\mathcal{B}_2$ restricted to $V_2$ are
measurable with respect to $\tilde{\mathcal{B}}_2$. We argue that $\tilde{\mathcal{B}}_2$ differs
from $\mathcal{B}_2$ only by polynomials of degree $\le d - 1$. Suppose $\mathcal{B}_2$ is not
$R_2$-regular to start off with. The function $R_1$ is chosen so that $\tilde{\mathcal{B}}_1$'s rank,
$R_1(|\tilde{\mathcal{B}}_1|)$, is $> R_2(|\mathcal{B}_2|) + |\mathcal{B}_2|$. This means that if a
linear combination of polynomials in $\mathcal{B}_2$, $\sum_{S \in \mathcal{B}_2} \lambda_S S$, has
$d'$-rank $\le R_2(|\mathcal{B}_2|)$, where $d' = \max_{S : \lambda_S \ne 0} \deg(S)$, then there
must be an $S \notin \tilde{\mathcal{B}}_1$ with $\lambda_S \ne 0$ and degree $d'$, since otherwise
the rank condition of $\tilde{\mathcal{B}}_1$ would be violated. Since all the polynomials in
$\mathcal{B}_2$ which are not in $\tilde{\mathcal{B}}_1$ have degree $\le d - 1$, we conclude that
$d' \le d - 1$. Inspecting the proof of Lemma 2.20 in [TZ11] shows that this means
$\tilde{\mathcal{B}}_2$ consists of the polynomials of $\tilde{\mathcal{B}}_1$ along with other
polynomials of degree $\le d - 1$. In the same way as in the previous paragraph, we obtain an
extended factor $\mathcal{B}_3 \succeq_{syn} \tilde{\mathcal{B}}_2$, so that $\mathcal{B}_3$ is a
semantic refinement of $\mathcal{B}_2$ over all of $\mathbb{F}^n$. Note that since all the
polynomials of $\tilde{\mathcal{B}}_1$ are already in $\mathcal{B}_2$, we only need to add
$\{P^a \mid P \in \tilde{\mathcal{B}}_2 \setminus \tilde{\mathcal{B}}_1, a \in \mathbb{F}^n/V_2\}$,
together with some linear functions. All these polynomials have degree at most $d - 2$.

We keep repeating this process to obtain a sequence of extended factors
$\mathcal{B}_1, \mathcal{B}_2, \mathcal{B}_3, \ldots$ and
$\tilde{\mathcal{B}}_1, \tilde{\mathcal{B}}_2, \tilde{\mathcal{B}}_3, \ldots$. Each
$\mathcal{B}_{i+1}$ semantically refines $\mathcal{B}_i$ and syntactically refines
$\tilde{\mathcal{B}}_i$. The process stops at step $i$ if $\mathcal{B}_i$ becomes $R_i$-regular,
where the sequence of growth functions $R_i$ satisfies
$R_i(m) > R_{i+1}(m + 2dm I^{(r,d)}(m)) + m + 2dm I^{(r,d)}(m)$ and $R_d(m) = r(m)$. The functions
$R_i$ are chosen so that
$R_i(|\tilde{\mathcal{B}}_i|) > R_{i+1}(|\mathcal{B}_{i+1}|) + |\mathcal{B}_{i+1}|$ and therefore, by
the above argument, $\tilde{\mathcal{B}}_{i+1}$ differs from $\tilde{\mathcal{B}}_i$ by polynomials
of degree $\le d - i$. So, we must stop after obtaining $\mathcal{B}_d$ in the sequence. Also, since
each $R_i$ grows faster than $r$, note that $R_i$-regularity for any $i \in [d]$ implies
$r$-regularity. So, it must be that some $\mathcal{B}_i$ for $i \le d$ already becomes $r$-regular.

Given an extended factor $\mathcal{B}''$ of rank $> r$, we can get a (standard) factor
$\mathcal{B}'$ of rank $> r$ by letting $\mathcal{B}'$ be defined by the smallest subset of
polynomials $S$ such that $\{p^i P \mid P \in S, i \in \mathbb{Z}_{\ge 0}\} \supseteq \mathcal{B}''$.
The last statement of the lemma follows from the same considerations as used above to argue that
$\mathcal{B}_{i+1}$ syntactically refines $\tilde{\mathcal{B}}_i$. □



3 Equidistribution of Regular Factors 

In this section, we make precise the intuition that a high-rank collection of polynomials often 
behaves like a collection of independent random variables. The key technical tool is the connection 
between the combinatorial notion of rank and the analytic notion of bias, given in Theorem 2.12. 
A weaker statement, that was established earlier by Kaufman and Lovett^ and used by Tao and 
Ziegler in their proof of Theorem 2.12, is the following. 

Theorem 3.1 (Theorem 4 of [KL08]). For any $\varepsilon > 0$ and integer $d > 0$, there exists
$r = r_{3.1}(d, \varepsilon)$ such that the following is true. If
$P : \mathbb{F}^n \to \mathbb{T}$ is a degree-$d$ polynomial with rank greater than $r$, then
$|\mathbb{E}_x[e(P(x))]| < \varepsilon$.

Proof. Given Theorem 2.12, this follows directly from the easy fact that
$|\mathbb{E}[f]| \le \|f\|_{U^d}$ for every $d \ge 2$, and every
$f : \mathbb{F}^n \to \mathbb{C}$. □

Using a standard observation that relates the bias of a function to its distribution on its range, 
we can conclude the following. 

^Kaufman and Lovett proved Theorem 3.1 for classical polynomials. But their proof also works for non-classical 
ones without modification. 






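Lemma 3.2 (Size of atoms). Given $\varepsilon > 0$, let $\mathcal{B}$ be a polynomial factor of
degree $d > 0$, complexity $C$, and rank $r_{3.1}(d, \varepsilon)$, defined by a tuple of polynomials
$P_1, \ldots, P_C : \mathbb{F}^n \to \mathbb{T}$ having respective depths $k_1, \ldots, k_C$. Suppose
$b = (b_1, \ldots, b_C) \in \mathbb{U}_{k_1+1} \times \cdots \times \mathbb{U}_{k_C+1}$. Then

\[ \Pr_x[\mathcal{B}(x) = b] = \frac{1}{\|\mathcal{B}\|} \pm \varepsilon. \]

In particular, for $\varepsilon < \frac{1}{\|\mathcal{B}\|}$, $\mathcal{B}(x)$ attains every possible
value in its range and thus has $\|\mathcal{B}\|$ atoms.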
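
Before the proof, a small numerical illustration may help. The sketch below is not from the paper: it counts atom sizes for a factor on $\mathbb{F}_2^6$ defined by a single quadratic form chosen here for convenience, and it makes no attempt to compute the rank threshold $r_{3.1}$. It only shows the phenomenon that atom probabilities stay close to $1/\|\mathcal{B}\|$, with a deviation governed by the bias of the polynomial.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def atom_probabilities(p, n, polys):
    """Exact distribution of x -> (P_1(x), ..., P_C(x)) for uniform x in F_p^n."""
    counts = Counter(tuple(P(x) for P in polys) for x in product(range(p), repeat=n))
    total = p ** n
    return {atom: Fraction(count, total) for atom, count in counts.items()}

def quad(x):
    # Q(x) = x1 x2 + x3 x4 + x5 x6 over F_2; the classical polynomial is iota(Q) = |Q| / 2,
    # so the factor has order ||B|| = 2 (hypothetical example, chosen for illustration).
    return (x[0] * x[1] + x[2] * x[3] + x[4] * x[5]) % 2

probabilities = atom_probabilities(2, 6, [quad])
```

Here the two atoms have probabilities $9/16$ and $7/16$, so each is within $1/16$ of $1/\|\mathcal{B}\| = 1/2$; this deviation is exactly half of the bias $|\mathbb{E}_x e(\iota(Q(x)))| = 1/8$.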
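Proof.

\[
\begin{aligned}
\Pr_x[\mathcal{B}(x) = b]
&= \mathbb{E}_x\left[ \prod_{i=1}^{C} \frac{1}{p^{k_i+1}} \sum_{\lambda_i = 0}^{p^{k_i+1}-1}
   e\big(\lambda_i (P_i(x) - b_i)\big) \right] \\
&= \prod_{i=1}^{C} p^{-(k_i+1)} \sum_{(\lambda_1, \ldots, \lambda_C) \in \prod_i [0, p^{k_i+1}-1]}
   \mathbb{E}_x\left[ e\Big(\sum_{i=1}^{C} \lambda_i (P_i(x) - b_i)\Big) \right] \\
&= \frac{1}{\|\mathcal{B}\|} \pm \varepsilon.
\end{aligned}
\]

The first equality uses the fact that $P_i(x) - b_i$ is in $\mathbb{U}_{k_i+1}$ and that for any
nonzero $x \in \mathbb{U}_{k_i+1}$, $\sum_{\lambda=0}^{p^{k_i+1}-1} e(\lambda x) = 0$. The third
equality uses Theorem 3.1 and the fact that unless every $\lambda_i = 0$, the polynomial
$\sum_i \lambda_i (P_i(x) - b_i)$ has rank at least $r_{3.1}(d, \varepsilon)$. □

For our applications, we need to not only understand the distribution of
$\mathcal{B}(x) = (P_i(x))$ but also, more generally, $(P_i(L_j(x)))$ for a given sequence of linear
forms $L_1, \ldots, L_m : (\mathbb{F}^n)^\ell \to \mathbb{F}^n$. To this end, we first show the
following dichotomy theorem.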



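Theorem 3.3 (Near orthogonality). Given $\varepsilon > 0$, suppose
$\mathcal{B} = (P_1, \ldots, P_C)$ is a polynomial factor of degree $d > 0$ and rank
$> r_{2.12}(d, \varepsilon)$, $A = (L_1, \ldots, L_m)$ is an affine constraint on $\ell$ variables,
and $\Lambda$ is a tuple of integers $(\lambda_{i,j})_{i \in [C], j \in [m]}$. Define

\[ P_{A,\mathcal{B},\Lambda}(x_1, \ldots, x_\ell) =
   \sum_{i \in [C], j \in [m]} \lambda_{i,j} P_i(L_j(x_1, \ldots, x_\ell)). \]

Then, one of the two statements below is true.

• For every $i \in [C]$, it holds that $\sum_{j \in [m]} \lambda_{i,j} Q_i(L_j(\cdot)) \equiv 0$ for
all polynomials $Q_i : \mathbb{F}^n \to \mathbb{T}$ with the same degree and depth as $P_i$. Clearly,
$P_{A,\mathcal{B},\Lambda} \equiv 0$ in this case.

• $P_{A,\mathcal{B},\Lambda}$ is non-constant. Moreover,
$|\mathbb{E}_{x_1, \ldots, x_\ell}[e(P_{A,\mathcal{B},\Lambda}(x_1, \ldots, x_\ell))]| < \varepsilon$.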

Proof. If $\lambda_{i,j} \ne 0$, then $\lambda_{i,j} P_i$ can be assumed to be non-constant, since
otherwise, we can set $\lambda_{i,j}$ to 0. Let the depths of $P_1, \ldots, P_C$ be
$k_1, \ldots, k_C$ respectively. For each $j \in [m]$, we let $(w_{j,1}, \ldots, w_{j,\ell})$ denote
the vector corresponding to the affine form $L_j$; recall that $w_{j,1} = 1$. For any affine form
$L_j$, let $|L_j|$, its weight, denote the sum $\sum_{t=2}^{\ell} |w_{j,t}|$.

For each $i$, perform the following step independently. If there exists a $j$ such that
$|L_j| > \deg(\lambda_{i,j} P_i)$ and $\lambda_{i,j} \ne 0$, then use Eq. (3) to replace
$\lambda_{i,j} P_i(L_j(\cdot))$ by a linear combination over $\mathbb{Z}$ of $P_i(L_{j'}(\cdot))$
with $L_{j'} \preceq L_j$, and repeat until no such $j$ exists. Here we use the assumption in part
(ii) of Definition 1.8 that such $L_{j'} \in A$. At the end of this process, we obtain a new tuple of
coefficients $\Lambda' = (\lambda'_{i,j})$; we can assume that each
$\lambda'_{i,j} \in [0, p^{k_i+1} - 1]$ after quotienting with $p^{k_i+1}\mathbb{Z}$.

If all the $\lambda'_{i,j}$ are zero, then for the original coefficients $(\lambda_{i,j})$ also,
$\sum_{j \in [m]} \lambda_{i,j} P_i(L_j(\cdot))$ is identically zero for every $i$ individually.
Indeed, $\sum_{j \in [m]} \lambda_{i,j} Q_i(L_j(\cdot))$ is zero for any $Q_i$ with the same degree
and depth as $P_i$ because the above transformation from $\Lambda$ to $\Lambda'$ only depended upon
the degree and depth of $P_i$.

Otherwise, $\Lambda'$ does not consist of all zeroes, and for every nonzero $\lambda'_{i,j}$ we have
$|L_j| \le \deg(\lambda'_{i,j} P_i)$. In this case we show that
$|\mathbb{E}[e(P_{A,\mathcal{B},\Lambda'}(x_1, \ldots, x_\ell))]| < \varepsilon$. At a high level,
our goal is to express the bias of $P_{A,\mathcal{B},\Lambda'}$ in terms of the Gowers norm of a
linear combination of $P_i$'s and then use Theorem 2.12.

Suppose without loss of generality that the form $L_1$ satisfies:

(i) $\lambda'_{i,1} \ne 0$ for some $i \in [C]$.

(ii) $L_1$ is maximal in the sense that for every $j \ne 1$, either $\lambda'_{i,j} = 0$ for all
$i \in [C]$ or it is the case that $|w_{j,t}| < |w_{1,t}|$ for some $t \in [\ell]$.

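We want to "derive" $P_{A,\mathcal{B},\Lambda'}$ until we kill all $P_i(L_j(\cdot))$ terms for
$j > 1$. Given a vector $\alpha = (\alpha_1, \ldots, \alpha_\ell) \in \mathbb{F}^\ell$, an element
$y \in \mathbb{F}^n$, and a function $P : (\mathbb{F}^n)^\ell \to \mathbb{T}$, let us define

\[ D_{\alpha,y} P(x_1, \ldots, x_\ell) = P(x_1 + \alpha_1 y, \ldots, x_\ell + \alpha_\ell y)
   - P(x_1, \ldots, x_\ell). \]

Note that

\[
\begin{aligned}
D_{\alpha,y}(P_i \circ L_j)(x_1, \ldots, x_\ell)
&= P_i(L_j(x_1, \ldots, x_\ell) + L_j(\alpha) y) - P_i(L_j(x_1, \ldots, x_\ell)) \\
&= (D_{L_j(\alpha) y} P_i)(L_j(x_1, \ldots, x_\ell)).
\end{aligned}
\]

Thus, if $L_j(\alpha) = \langle L_j, \alpha \rangle = 0$, then
$D_{\alpha,y}(P_i \circ L_j) \equiv 0$ for all choices of $y$.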

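Set $\Delta = |L_1| = \sum_{i=2}^{\ell} |w_{1,i}|$. Let
$\alpha_1, \ldots, \alpha_\Delta \in \mathbb{F}^\ell$ be the set of all vectors of the form
$(-w, 0, \ldots, 0, 1, 0, \ldots, 0)$ where 1 is in the $i$th coordinate for $i \in [2, \ell]$ and
$0 \le w \le |w_{1,i}| - 1$ is an integer. Note that $\langle L_1, \alpha_k \rangle \ne 0$ for all
$k \in [\Delta]$, but for any $j > 1$, by maximality of $L_1$, there exists some $k \in [\Delta]$
such that $\langle L_j, \alpha_k \rangle = 0$. Consequently,

\[ (D_{\alpha_\Delta, y_\Delta} \cdots D_{\alpha_1, y_1} P_{A,\mathcal{B},\Lambda'})(x_1, \ldots, x_\ell)
   = \Big( D_{\langle L_1, \alpha_\Delta \rangle y_\Delta} \cdots D_{\langle L_1, \alpha_1 \rangle y_1}
     \sum_{i=1}^{C} \lambda'_{i,1} P_i \Big)(L_1(x_1, \ldots, x_\ell)). \]

Therefore

\[ \mathbb{E}_{\substack{y_1, \ldots, y_\Delta \\ x_1, \ldots, x_\ell}}
   \left[ e\big( (D_{\alpha_\Delta, y_\Delta} \cdots D_{\alpha_1, y_1} P_{A,\mathcal{B},\Lambda'})(x_1, \ldots, x_\ell) \big) \right]
   = \Big\| e\Big( \sum_{i=1}^{C} \lambda'_{i,1} P_i \Big) \Big\|_{U^\Delta}^{2^\Delta}. \tag{4} \]

On the other hand we have the following claim.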
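Claim 3.4.

\[ \mathbb{E}_{\substack{y_1, \ldots, y_\Delta \\ x_1, \ldots, x_\ell}}
   \left[ e\big( (D_{\alpha_\Delta, y_\Delta} \cdots D_{\alpha_1, y_1} P_{A,\mathcal{B},\Lambda'})(x_1, \ldots, x_\ell) \big) \right]
   \ge \left| \mathbb{E}_{x_1, \ldots, x_\ell} e\big( P_{A,\mathcal{B},\Lambda'}(x_1, \ldots, x_\ell) \big) \right|^{2^\Delta}. \]

Proof. It suffices to show that for any function $P(x_1, \ldots, x_\ell)$ and nonzero
$\alpha \in \mathbb{F}^\ell$,

\[ \mathbb{E}_{y, x_1, \ldots, x_\ell} \left[ e\big( (D_{\alpha,y} P)(x_1, \ldots, x_\ell) \big) \right]
   \ge \left| \mathbb{E}_{x_1, \ldots, x_\ell} \left[ e(P(x_1, \ldots, x_\ell)) \right] \right|^2. \]

Recall that $(D_{\alpha,y} P)(x_1, \ldots, x_\ell) = P(x_1 + \alpha_1 y, \ldots, x_\ell + \alpha_\ell y)
- P(x_1, \ldots, x_\ell)$. Without loss of generality, suppose $\alpha_1 \ne 0$. We make a change of
coordinates so that $\alpha$ can be assumed to be $(1, 0, \ldots, 0)$. More precisely, define
$P' : (\mathbb{F}^n)^\ell \to \mathbb{T}$ as

\[ P'(x_1, \ldots, x_\ell) = P\left( x_1, \frac{x_2 + \alpha_2 x_1}{\alpha_1},
   \frac{x_3 + \alpha_3 x_1}{\alpha_1}, \ldots, \frac{x_\ell + \alpha_\ell x_1}{\alpha_1} \right), \]

so that $P(x_1, \ldots, x_\ell) = P'(x_1, \alpha_1 x_2 - \alpha_2 x_1, \alpha_1 x_3 - \alpha_3 x_1,
\ldots, \alpha_1 x_\ell - \alpha_\ell x_1)$, and thus $(D_{\alpha,y} P)(x_1, \ldots, x_\ell) =
P'(x_1 + \alpha_1 y, \alpha_1 x_2 - \alpha_2 x_1, \ldots, \alpha_1 x_\ell - \alpha_\ell x_1)
- P'(x_1, \alpha_1 x_2 - \alpha_2 x_1, \ldots, \alpha_1 x_\ell - \alpha_\ell x_1)$. Therefore

\[
\begin{aligned}
&\mathbb{E}_{y, x_1, \ldots, x_\ell} \left[ e\big( (D_{\alpha,y} P)(x_1, \ldots, x_\ell) \big) \right] \\
&\quad = \mathbb{E}_{y, x_1, \ldots, x_\ell} \left[ e\big( P'(x_1 + \alpha_1 y, \alpha_1 x_2 - \alpha_2 x_1, \ldots, \alpha_1 x_\ell - \alpha_\ell x_1)
   - P'(x_1, \alpha_1 x_2 - \alpha_2 x_1, \ldots, \alpha_1 x_\ell - \alpha_\ell x_1) \big) \right] \\
&\quad = \mathbb{E}_{y, x_1, \ldots, x_\ell} \left[ e\big( P'(x_1 + \alpha_1 y, x_2, \ldots, x_\ell) - P'(x_1, x_2, \ldots, x_\ell) \big) \right] \\
&\quad = \mathbb{E}_{x_2, \ldots, x_\ell} \left| \mathbb{E}_{x_1} \left[ e(P'(x_1, x_2, \ldots, x_\ell)) \right] \right|^2 \\
&\quad \ge \left| \mathbb{E}_{x_1, x_2, \ldots, x_\ell} \left[ e(P'(x_1, x_2, \ldots, x_\ell)) \right] \right|^2 \\
&\quad = \left| \mathbb{E}_{x_1, x_2, \ldots, x_\ell} \left[ e(P(x_1, x_2, \ldots, x_\ell)) \right] \right|^2.
\end{aligned}
\]

□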
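
The single-derivative inequality used in the proof of Claim 3.4 holds for arbitrary functions, not only polynomials, so it can be spot-checked numerically. The following is a minimal sketch under assumptions made here: $P$ is a pseudo-random table of values in $\mathbb{U}_1$, the parameters are tiny, and the helper name `check_claim_step` is hypothetical.

```python
import cmath
import random
from itertools import product

def e(t):
    """The character e(t) = exp(2 pi i t) on T = R/Z."""
    return cmath.exp(2j * cmath.pi * t)

def check_claim_step(p, n, ell, alpha, seed=0):
    """Return (E_{y,x} e(D_{alpha,y} P), |E_x e(P)|^2) for a random P : (F_p^n)^ell -> U_1."""
    rng = random.Random(seed)
    vectors = list(product(range(p), repeat=n))
    inputs = list(product(vectors, repeat=ell))
    P = {xs: rng.randrange(p) / p for xs in inputs}  # arbitrary values k/p

    def shifted(xs, y):
        # (x_1 + alpha_1 y, ..., x_ell + alpha_ell y), coordinate-wise modulo p
        return tuple(
            tuple((a + c * b) % p for a, b in zip(x, y)) for x, c in zip(xs, alpha)
        )

    lhs = sum(e(P[shifted(xs, y)] - P[xs]) for xs in inputs for y in vectors)
    lhs /= len(inputs) * len(vectors)
    rhs = abs(sum(e(P[xs]) for xs in inputs) / len(inputs)) ** 2
    return lhs, rhs

lhs, rhs = check_claim_step(3, 1, 2, (1, 2))
```

Up to floating-point error, `lhs` is real and at least `rhs`, matching the chain of (in)equalities in the proof; the step that loses information is the Cauchy-Schwarz inequality over $x_2, \ldots, x_\ell$.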



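Therefore, combining Eq. (4) with Claim 3.4, we get:

\[ \Big\| e\Big( \sum_{i \in [C]} \lambda'_{i,1} P_i \Big) \Big\|_{U^\Delta}
   \ge \left| \mathbb{E}_{x_1, \ldots, x_\ell} e\big( P_{A,\mathcal{B},\Lambda'}(x_1, \ldots, x_\ell) \big) \right|. \]

Suppose $|\mathbb{E}_{x_1, \ldots, x_\ell} e(P_{A,\mathcal{B},\Lambda'}(x_1, \ldots, x_\ell))|
\ge \varepsilon$. Then, by the above inequality and Theorem 2.12, we get that
$\sum_{i \in [C]} \lambda'_{i,1} P_i(x)$ is a function of $r = r_{2.12}(d, \varepsilon)$ polynomials
of degree $\Delta - 1$. But recall that if $\lambda'_{i,1} \ne 0$, then
$\deg(\lambda'_{i,1} P_i) \ge |L_1| = \Delta$. Also, there exists a nonzero $\lambda'_{i,1}$. This is
a contradiction to our assumption that the factor $\mathcal{B}$ is of rank
$> r_{2.12}(d, \varepsilon)$. □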

Remark 3.5. The proof of Theorem 3.3 also shows the following. Suppose, in the setting of
Theorem 3.3, that for every $P_i \in \mathcal{B}$ and $L_j \in A$, either
$|L_j| \le \deg(\lambda_{i,j} P_i)$ or $\lambda_{i,j} = 0$. Then, unless every
$\lambda_{i,j} \equiv 0 \pmod{p^{k_i+1}}$, we have that $P_{A,\mathcal{B},\Lambda}$ is non-constant
and $|\mathbb{E}[e(P_{A,\mathcal{B},\Lambda}(x_1, \ldots, x_\ell))]| < \varepsilon$. The only
modification needed to the above proof is that the transformation from $\Lambda$ to $\Lambda'$ can be
omitted.
To show equidistribution of $(P_i(L_j(x_1, \ldots, x_\ell)))$, we can use Theorem 3.3 in the same
manner we used Theorem 3.1 to show the equidistribution of $(P_i(x))$ in Lemma 3.2. Before we do so,
however, let us give a name to those $\Lambda$ for which the first case of Theorem 3.3 holds.

Definition 3.6. Given an affine constraint $A = (L_1, \ldots, L_m)$ on $\ell$ variables and integers
$d > 0$ and $k \ge 0$ such that $d > k(p-1)$, the $(d, k)$-dependency set of $A$ is the set of tuples
$(\lambda_1, \ldots, \lambda_m) \in [0, p^{k+1} - 1]^m$ such that
$\sum_{i=1}^{m} \lambda_i P(L_i(x_1, \ldots, x_\ell)) \equiv 0$ for every polynomial
$P : \mathbb{F}^n \to \mathbb{T}$ of degree $d$ and depth $k$.
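
As a hedged illustration of this definition, consider a constraint chosen here for concreteness (it does not appear in the source, and it is assumed to qualify as an affine constraint in the sense of Definition 1.8). The example also assumes that depth $k = 0$ is permitted and that shifts are zero, as assumed throughout. Take $\ell = 3$ and $A = (x_1,\ x_1 + x_2,\ x_1 + x_3,\ x_1 + x_2 + x_3)$. For a classical polynomial $P$ of degree 1, Definition 2.7 says that every two-fold additive derivative vanishes, which gives one element of the $(1, 0)$-dependency set:

```latex
\begin{align*}
0 = (D_{x_2} D_{x_3} P)(x_1)
  &= P(x_1 + x_2 + x_3) - P(x_1 + x_2) - P(x_1 + x_3) + P(x_1), \\
\text{and since } pP \equiv 0, \quad
0 &\equiv 1 \cdot P(x_1) + (p-1) \cdot P(x_1 + x_2) + (p-1) \cdot P(x_1 + x_3)
     + 1 \cdot P(x_1 + x_2 + x_3), \\
\text{so} \quad (\lambda_1, \lambda_2, \lambda_3, \lambda_4) &= (1,\ p-1,\ p-1,\ 1) \in [0, p-1]^4 .
\end{align*}
```

The same tuple need not lie in the dependency set for polynomials of degree 2, since their second derivatives are constants that are in general nonzero.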

Theorem 3.3 says that if $\mathcal{B}$ is a regular factor, $P_{A,\mathcal{B},\Lambda} \equiv 0$
exactly when the first condition holds. In other words:

Corollary 3.7. Fix an integer $C > 0$, tuples $(d_1, \ldots, d_C) \in \mathbb{Z}_{>0}^C$ and
$(k_1, \ldots, k_C) \in \mathbb{Z}_{\ge 0}^C$, and an affine constraint $A = (L_1, \ldots, L_m)$ on
$\ell$ variables. For $i \in [C]$, let $\Lambda_i$ be the $(d_i, k_i)$-dependency set of $A$.

Then, for any polynomial factor $\mathcal{B} = (P_1, \ldots, P_C)$, where each $P_i$ has degree $d_i$
and depth $k_i$, and $\mathcal{B}$ has rank $> r_{2.12}(\max_i d_i, \ldots)$, it is the case that a
tuple $(\lambda_{i,j})_{i \in [C], j \in [m]}$ satisfies

\[ \sum_{i=1}^{C} \sum_{j=1}^{m} \lambda_{i,j} P_i(L_j(x_1, \ldots, x_\ell)) \equiv 0 \]

if and only if for every $i \in [C]$,
$(\lambda_{i,1} \bmod p^{k_i+1}, \ldots, \lambda_{i,m} \bmod p^{k_i+1}) \in \Lambda_i$.

Proof. The "if" direction is obvious. For the "only if" direction, we use Theorem 3.3 to conclude
that if $\sum_{i,j} \lambda_{i,j} P_i(L_j(\cdot)) \equiv 0$, it must be that for every $i \in [C]$,
$\sum_{j} \lambda_{i,j} Q_i(L_j(\cdot)) \equiv 0$ for any polynomial $Q_i$ with degree $d_i$ and depth
$k_i$. This is equivalent to saying
$(\lambda_{i,1} \bmod p^{k_i+1}, \ldots, \lambda_{i,m} \bmod p^{k_i+1}) \in \Lambda_i$. □

Remark 3.8. For large characteristic fields, Hatami and Lovett [HL11a] showed that the analog
of Corollary 3.7 is true even without the rank condition.

The distribution of $(P_i(L_j(x_1, \ldots, x_\ell)))$ is only going to be supported on atoms which
respect the constraints imposed by dependency sets. This is obvious: if $P$ is a polynomial of degree
$d$ and depth $k$, $(\lambda_1, \ldots, \lambda_m)$ is in the $(d, k)$-dependency set of
$(L_1, \ldots, L_m)$, and $P(L_j(x_1, \ldots, x_\ell)) = b_j$, then $\sum_j \lambda_j b_j = 0$. We
call atoms which respect this constraint for all $P_i$ in a factor consistent. Formally:

Definition 3.9 (Consistency). Let $A$ be an affine constraint of size $m$. A sequence of elements
$b_1, \ldots, b_m \in \mathbb{T}$ are said to be $(d, k)$-consistent with $A$ if
$b_1, \ldots, b_m \in \mathbb{U}_{k+1}$ and for every tuple $(\lambda_1, \ldots, \lambda_m)$ in the
$(d, k)$-dependency set of $A$, it holds that $\sum_{i=1}^{m} \lambda_i b_i = 0$.

Given vectors $\mathbf{d} = (d_1, \ldots, d_C) \in \mathbb{Z}_{>0}^C$ and
$\mathbf{k} = (k_1, \ldots, k_C) \in \mathbb{Z}_{\ge 0}^C$, a sequence of vectors
$b_1, \ldots, b_m \in \mathbb{T}^C$ are said to be $(\mathbf{d}, \mathbf{k})$-consistent with $A$ if
for every $i \in [C]$, the elements $b_{1,i}, \ldots, b_{m,i}$ are $(d_i, k_i)$-consistent with $A$.

If $\mathcal{B}$ is a polynomial factor, the term $\mathcal{B}$-consistent with $A$ is a synonym for
$(\mathbf{d}, \mathbf{k})$-consistent with $A$ where $\mathbf{d} = (d_1, \ldots, d_C)$ and
$\mathbf{k} = (k_1, \ldots, k_C)$ are respectively the degree and depth sequences of polynomials
defining $\mathcal{B}$.

Now, the proof of equidistribution of $(P_i(L_j(x_1, \ldots, x_\ell)))$ is straightforward.

Theorem 3.10. Given $\varepsilon > 0$, let $\mathcal{B}$ be a polynomial factor of degree $d > 0$, complexity $C$, and rank $> r_{3.10}(d, \varepsilon)$, that is defined by a tuple of polynomials $P_1, \ldots, P_C : \mathbb{F}^n \to \mathbb{T}$ having respective degrees $d_1, \ldots, d_C$ and respective depths $k_1, \ldots, k_C$. Let $A = (L_1, \ldots, L_m)$ be an affine constraint on $\ell$ variables.

Suppose $b_1, \ldots, b_m \in \mathbb{T}^C$ are atoms of $\mathcal{B}$ that are $\mathcal{B}$-consistent with $A$. Then

$$\Pr_{x_1, \ldots, x_\ell}\bigl[\mathcal{B}(L_j(x_1, \ldots, x_\ell)) = b_j \ \forall j \in [m]\bigr] = \frac{\prod_{i=1}^{C} |\Lambda_i|}{\|\mathcal{B}\|^m} \pm \varepsilon,$$

where $\Lambda_i$ is the $(d_i, k_i)$-dependency set of $A$.
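Proof. The proof is similar to that of Lemma 3.2. Writing $b_{i,j}$ for the $i$-th coordinate of $b_j$,

$$\Pr_{x_1, \ldots, x_\ell}\bigl[P_i(L_j(x_1, \ldots, x_\ell)) = b_{i,j} \ \forall i \in [C], \forall j \in [m]\bigr]$$

$$= \mathop{\mathbb{E}}_{x_1, \ldots, x_\ell}\Bigl[\prod_{i,j} \frac{1}{p^{k_i+1}} \sum_{\lambda_{i,j}=0}^{p^{k_i+1}-1} e\bigl(\lambda_{i,j}(P_i(L_j(x_1, \ldots, x_\ell)) - b_{i,j})\bigr)\Bigr]$$

$$= \Bigl(\prod_{i \in [C]} p^{-(k_i+1)}\Bigr)^{m} \sum_{(\lambda_{i,j}) \in \prod_{i,j}[0, p^{k_i+1}-1]} e\Bigl(-\sum_{i,j} \lambda_{i,j} b_{i,j}\Bigr) \mathop{\mathbb{E}}_{x_1, \ldots, x_\ell}\Bigl[e\Bigl(\sum_{i,j} \lambda_{i,j} P_i(L_j(x_1, \ldots, x_\ell))\Bigr)\Bigr]$$

$$= \frac{\prod_{i=1}^{C} |\Lambda_i|}{\|\mathcal{B}\|^m} \pm \varepsilon,$$

where the last line follows because by Corollary 3.7, $\sum_{i,j} \lambda_{i,j} P_i(L_j(\cdot))$ is identically zero for $\prod_i |\Lambda_i|$ many tuples $(\lambda_{i,j})$ and, in that case, $\sum_{i,j} \lambda_{i,j} b_{i,j} = 0$ because of the consistency requirement. For any other tuple $(\lambda_{i,j})$, the expectation in the third line is bounded by $\varepsilon$ in absolute value. □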



4 Degree-structural Properties 

In this section, we prove Theorem 1.7 in the introduction, stating that if $\mathcal{P}$ is degree-structural (recall Definition 1.6), then $\mathcal{P}$ is locally characterized. The proof uses many of the tools established in Section 3.



Theorem 1.7 (restated). Every degree-structural property with bounded scope and max-degree is a locally characterized affine-invariant property.

Proof. Let $\mathcal{P}$ be a degree-structural property with scope $\sigma$ and max-degree $\Delta$. Denote by $S$ the set of tuples $(c, \mathbf{d}, \Gamma)$ such that $c \le \sigma$ and $\mathcal{P}$ is the union over all $(c, \mathbf{d}, \Gamma) \in S$ of $(c, \mathbf{d}, \Gamma)$-structured functions. It is clear that $\mathcal{P}$ is affine-invariant, as having degree bounded by a constant is an affine-invariant property. It is also immediate that $\mathcal{P}$ is closed under taking restrictions to subspaces, since if $F$ is $(c, \mathbf{d}, \Gamma)$-structured, then $F$ restricted to any hyperplane is also $(c, \mathbf{d}, \Gamma)$-structured. The non-trivial part of the theorem is to show that the locality is bounded. In other words, we need to show that there is a constant $K$ such that for $n \ge K$, if $F : \mathbb{F}^n \to \mathbb{T}$ is a function with $F|_A \in \mathcal{P}$ for every hyperplane $A \le \mathbb{F}^n$, then $F \in \mathcal{P}$.

First, let us bound the degree of $F$. We know that $F|_A \in \mathcal{P}$ for every hyperplane $A$. Therefore, $\deg(F|_A) \le p\sigma\Delta$ for every $A$, as $F|_A$ is a function of at most $\sigma$ polynomials, each of degree at most $\Delta$, over a field of characteristic $p$. It follows that $F$ itself is of degree $\le p\sigma\Delta$.

Let $r : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ be a function to be fixed later. Define $r_2 : \mathbb{Z}_{>0} \to \mathbb{Z}_{>0}$ so that $r_2(m) >$

^(^1^ V + ^)) + 4?rr V + + p. 

We apply Lemma 2.19 to $\{F\}$ to find an $r_2$-regular polynomial factor $\mathcal{B}$ of degree $\le p\sigma\Delta$, defined
by polynomials Ri, . . . , Rc : ^ T, where C < C^^iqC^)- 

Since $F$ is measurable with respect to $\mathcal{B}$, there exists a function $\Sigma : \mathbb{T}^C \to \mathbb{F}$ such that $F(x) = \Sigma(R_1(x), \ldots, R_C(x))$.

From each $R_i$ pick a monomial with degree equal to $\deg(R_i)$ and a monomial (possibly the same one) with depth equal to $\mathrm{depth}(R_i)$. By taking $K$ to be sufficiently large, we can guarantee the existence of an $i_0 \in [n]$ such that $x_{i_0}$ is not involved in any of these monomials. Consequently, $\deg(R'_i) = \deg(R_i)$ and $\mathrm{depth}(R'_i) = \mathrm{depth}(R_i)$ for all $i \in [C]$, where $R'_1, \ldots, R'_C$ are the restrictions of $R_1, \ldots, R_C$, respectively, to the hyperplane $\{x_{i_0} = 0\}$. Also, by Lemma 2.13, $R'_1, \ldots, R'_C$ have rank $> r_2(C) - p$. Since $F|_{\{x_{i_0} = 0\}} \in \mathcal{P}$, by definition of $\mathcal{P}$, there must exist $(c, \mathbf{d}, \Gamma) \in S$ with $c \le \sigma$ such that

$$\Sigma(R'_1, \ldots, R'_C) = \Gamma(P_1, \ldots, P_c),$$

where $\deg(P_i) \le d_i$ for all $i \in [c]$.

Now, apply Lemma 2.19 to find an $r$-regular refinement of the factor defined by the tuple of polynomials $(R'_1, \ldots, R'_C, P_1, \ldots, P_c)$. Because of our choice of $r_2$ and the last part of Lemma 2.19, we obtain a syntactic refinement of $\{R'_1, \ldots, R'_C\}$. That is, we obtain a tuple $\mathcal{B}'$ of polynomials
R[, . . . , R'q, Si, . . . , Sd : F"' ^ T such that it has degree ^ po'A, its rank > r(C + D), and 
C + D ^ ^2*^19 ~'~ ^^'^ each i £ [c]. Pi = Ti{R[, . . . , R'^, Si, ... , Sd) for some function 
Ti : T^+^ T. So for ah x € F", 

$$\Sigma(R'_1(x), \ldots, R'_C(x)) = \Gamma\bigl(\Gamma_1(R'_1(x), \ldots, R'_C(x), S_1(x), \ldots, S_D(x)), \ \ldots, \ \Gamma_c(R'_1(x), \ldots, R'_C(x), S_1(x), \ldots, S_D(x))\bigr).$$

Applying Lemma 3.2, we see that if the rank of $\mathcal{B}'$ is $> r_{3.2}(p\sigma\Delta, \varepsilon)$ where $\varepsilon > 0$ is sufficiently small (say $\varepsilon = 1/(2\|\mathcal{B}'\|)$), then $(R'_1(x), \ldots, R'_C(x), S_1(x), \ldots, S_D(x))$ acquires every value in its range. Thus, we have the identity

$$\Sigma(a_1, \ldots, a_C) = \Gamma\bigl(\Gamma_1(a_1, \ldots, a_C, b_1, \ldots, b_D), \ldots, \Gamma_c(a_1, \ldots, a_C, b_1, \ldots, b_D)\bigr),$$

for every $a_i \in \mathbb{U}_{\mathrm{depth}(R'_i)+1}$ and $b_i \in \mathbb{U}_{\mathrm{depth}(S_i)+1}$. Thus, we can substitute $R_i$ for $R'_i$ and $0$ for $S_i$ in the above equation and still retain the identity

$$F(x) = \Sigma(R_1(x), \ldots, R_C(x))$$
$$= \Gamma\bigl(\Gamma_1(R_1(x), \ldots, R_C(x), 0, \ldots, 0), \ldots, \Gamma_c(R_1(x), \ldots, R_C(x), 0, \ldots, 0)\bigr)$$
$$= \Gamma(Q_1(x), \ldots, Q_c(x)),$$

where $Q_i : \mathbb{F}^n \to \mathbb{T}$ are defined as $Q_i(x) = \Gamma_i(R_1(x), \ldots, R_C(x), 0, \ldots, 0)$. Since for every $i$, $\deg(R_i) = \deg(R'_i)$ and $\mathrm{depth}(R_i) = \mathrm{depth}(R'_i)$, we can apply Theorem 4.1 below to conclude that $\deg(Q_i) \le \deg(P_i) \le d_i$ for every $i \in [c]$, as long as the rank of $\mathcal{B}'$ is $> r_{4.1}(p\sigma\Delta)$. Finally, we show that $Q_1, \ldots, Q_c$ map to $\mathbb{U}_1 = \iota(\mathbb{F})$ and, so, are classical. Indeed, since $P_1, \ldots, P_c$ are classical, $\Gamma_1, \ldots, \Gamma_c$ must map to $\iota(\mathbb{F})$ on all of $\prod_{i=1}^{C} \mathbb{U}_{\mathrm{depth}(R'_i)+1} \times \prod_{i=1}^{D} \mathbb{U}_{\mathrm{depth}(S_i)+1} \supseteq \prod_{i=1}^{C} \mathbb{U}_{\mathrm{depth}(R_i)+1} \times \{0\}^D$. Hence, $F \in \mathcal{P}$. □

The following theorem, used in the proof above, shows that a function of a high-rank collection of polynomials has the degree one would expect. Thus, it displays yet another way in which high-rank polynomials behave "generically". The proof is via another application of the near-orthogonality result in Theorem 3.3.

Theorem 4.1. For an integer $d > 0$, let $P_1, \ldots, P_C : \mathbb{F}^n \to \mathbb{T}$ be polynomials of degree $\le d$ and rank $> r_{4.1}(d)$, and let $\Gamma : \mathbb{T}^C \to \mathbb{T}$ be an arbitrary function. Define the polynomial $F : \mathbb{F}^n \to \mathbb{T}$ by $F(x) = \Gamma(P_1(x), \ldots, P_C(x))$. Then, for every collection of polynomials $Q_1, \ldots, Q_C : \mathbb{F}^n \to \mathbb{T}$ with $\deg(Q_i) \le \deg(P_i)$ and $\mathrm{depth}(Q_i) \le \mathrm{depth}(P_i)$ for all $i \in [C]$, if $G : \mathbb{F}^n \to \mathbb{T}$ is the polynomial $G(x) = \Gamma(Q_1(x), \ldots, Q_C(x))$, it holds that $\deg(G) \le \deg(F)$.

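Proof. Let $f(x) = e(F(x))$ and $\gamma(x_1, \ldots, x_C) = e(\Gamma(x_1, \ldots, x_C))$. Let $D = \deg(F)$. Then, for every $x, y_1, \ldots, y_{D+1} \in \mathbb{F}^n$,

$$\Delta_{y_{D+1}} \cdots \Delta_{y_1} f(x) = 1.$$

We need to show that $g(x) = e(G(x))$ also satisfies $\Delta_{y_{D+1}} \cdots \Delta_{y_1} g(x) = 1$.

Let $k_1, \ldots, k_C$ be the depths of $P_1, \ldots, P_C$, respectively. Then, each $P_i$ takes values in $\mathbb{U}_{k_i+1}$. Let $\Sigma$ denote the group $\mathbb{Z}_{p^{k_1+1}} \times \cdots \times \mathbb{Z}_{p^{k_C+1}}$. Considering the Fourier transform of $\gamma$, we have

$$f(x) = \gamma(P_1(x), \ldots, P_C(x)) = \sum_{\beta \in \Sigma} \hat{\gamma}(\beta)\, e\Bigl(\sum_{i=1}^{C} \beta_i P_i(x)\Bigr).$$

Next, we look at the derivative. Writing $\mathcal{C}$ for complex conjugation,

$$\Delta_{y_{D+1}} \cdots \Delta_{y_1} f(x) = \Delta_{y_{D+1}} \cdots \Delta_{y_1} \Bigl(\sum_{\beta \in \Sigma} \hat{\gamma}(\beta)\, e\Bigl(\sum_{i=1}^{C} \beta_i P_i(x)\Bigr)\Bigr)$$

$$= \sum_{\alpha_J \in \Sigma :\, J \subseteq [D+1]} \ \prod_{J \subseteq [D+1]} \mathcal{C}^{D+1-|J|} \hat{\gamma}(\alpha_J)\, e\Bigl((-1)^{D+1-|J|} \sum_{i=1}^{C} \alpha_{J,i} P_i\Bigl(x + \sum_{j \in J} y_j\Bigr)\Bigr).$$

Denoting $\delta(\alpha) = \prod_{J \subseteq [D+1]} \mathcal{C}^{D+1-|J|} \hat{\gamma}(\alpha_J)$ for $\alpha = (\alpha_J)_{J \subseteq [D+1]} \in \Sigma^{\mathcal{P}([D+1])}$, we have

$$\Delta_{y_{D+1}} \cdots \Delta_{y_1} f(x) = \sum_{\alpha \in \Sigma^{\mathcal{P}([D+1])}} \delta(\alpha)\, e\Bigl(\sum_{i=1}^{C} \sum_{J \subseteq [D+1]} (-1)^{|J|+D+1} \alpha_{J,i} P_i\Bigl(x + \sum_{j \in J} y_j\Bigr)\Bigr). \qquad (5)$$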



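For any $i$, if there is a $J$ such that $|J| > \deg(\alpha_{J,i} P_i)$, we can use Eq. (3) to rewrite $\alpha_{J,i} P_i\bigl(x + \sum_{j \in J} y_j\bigr)$ as a linear combination (over $\mathbb{Z}$) of $\bigl\{P_i\bigl(x + \sum_{j \in J'} y_j\bigr) : J' \subsetneq J\bigr\}$. We repeat this process until for every $i$ and $J$, either $\alpha_{J,i} = 0$ or $|J| \le \deg(\alpha_{J,i} P_i)$. Denoting by $\mathcal{A}$ the set of $\alpha \in \Sigma^{\mathcal{P}([D+1])}$ that satisfy this condition, we have obtained a new set of coefficients $\delta'(\alpha)$ such that

$$\Delta_{y_{D+1}} \cdots \Delta_{y_1} f(x) = \sum_{\alpha \in \mathcal{A}} \delta'(\alpha)\, e\Bigl(\sum_{i=1}^{C} \sum_{J \subseteq [D+1]} \alpha_{J,i} P_i\Bigl(x + \sum_{j \in J} y_j\Bigr)\Bigr).$$

Now, the crucial observation is that if instead of $P_1, \ldots, P_C$ we had $Q_1, \ldots, Q_C$, the same decomposition applies:

$$\Delta_{y_{D+1}} \cdots \Delta_{y_1} g(x) = \sum_{\alpha \in \mathcal{A}} \delta'(\alpha)\, e\Bigl(\sum_{i=1}^{C} \sum_{J \subseteq [D+1]} \alpha_{J,i} Q_i\Bigl(x + \sum_{j \in J} y_j\Bigr)\Bigr). \qquad (6)$$

The reason is that Eq. (5) remains valid as is if $f$ is replaced by $g$ and the $P_i$'s are replaced by $Q_i$'s, and furthermore, since $\deg(Q_i) \le \deg(P_i)$ and $\mathrm{depth}(Q_i) \le \mathrm{depth}(P_i)$, the applications of Eq. (3) remain valid also. Therefore, Eq. (6) is also valid.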

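But now, we argue that the $\delta'(\cdot)$ are uniquely determined. Let $k = \max_i k_i \le d/(p-1)$.

Claim 4.2. If $P_1, \ldots, P_C$ are of rank $> r_{2.12}(d, 1/|\mathcal{A}|) + 1$, the functions

$$\Bigl\{ e\Bigl(\sum_{i=1}^{C} \sum_{J \subseteq [D+1]} \alpha_{J,i} P_i\Bigl(x + \sum_{j \in J} y_j\Bigr)\Bigr) : \alpha \in \mathcal{A} \Bigr\}$$

are linearly independent over $\mathbb{C}$.

Proof. Note that all these functions have $L^2$-norm equal to 1. Hence it suffices to show that their pairwise inner products are all bounded in absolute value by $1/|\mathcal{A}|$. To prove this, consider $\alpha, \beta \in \mathcal{A}$, and note that by Theorem 3.3 and, in particular, Remark 3.5, unless all the $\alpha_{J,i} - \beta_{J,i}$ are zero,

$$\Bigl| \mathbb{E}\Bigl[ e\Bigl(\sum_{i=1}^{C} \sum_{J \subseteq [D+1]} (\alpha_{J,i} - \beta_{J,i}) P_i\Bigl(x + \sum_{j \in J} y_j\Bigr)\Bigr) \Bigr] \Bigr| < \frac{1}{|\mathcal{A}|}.$$

□

Therefore, since $\Delta_{y_{D+1}} \cdots \Delta_{y_1} f(x) = 1$, we must have $\delta'(\alpha) = 1$ when $\alpha$ is the all-zero tuple, and $\delta'(\alpha) = 0$ for every nonzero $\alpha$. Plugging into Eq. (6), we get $\Delta_{y_{D+1}} \cdots \Delta_{y_1} g(x) = 1$.

□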



5 Property Testing 

5.1 Decomposition Theorems 

Decomposition theorems are a major class of theorems in additive and extremal combinatorics. These are statements that tell us that a function $f$ with certain properties can be decomposed as a sum $\sum_i g_i$, where the functions $g_i$ have certain other properties. We have already seen
a decomposition theorem in Theorem 2.12: if a polynomial P : F" — > T of degree ^ d satisfies 
||e(P)||^£i^ e, then there exists a factor B of complexity ^ '^2.12(^'^) such that P is a function of 
the polynomials defining B. 

In this section, we discuss decomposition theorems of a particular type called approximate structure theorems. These are results that say that, under appropriate conditions, we can write a function $f$ as $f_1 + f_2$, where $f_1$ is "structured" in some sense, and $f_2$ is "quasirandom". The rough idea is that the structure of $f_1$ is strong enough for us to be able to analyze it reasonably explicitly, and the quasirandomness of $f_2$ is strong enough for many properties of $f_1$ to be unaffected if we "perturb" it to $f = f_1 + f_2$. Often, in order to obtain stronger statements about the structure and the quasirandomness, one also allows a small $L^2$-error: that is, one writes $f$ as $f_1 + f_2 + f_3$ with $f_1$ structured, $f_2$ quasirandom, and $f_3$ small in $L^2$.

(Footnote to Eq. (6): Note that in Eq. (6), one could have nonzero $\alpha_{J,i}$ and $|J| > \deg(\alpha_{J,i} Q_i)$, for $\alpha = (\alpha_J)_{J \subseteq [D+1]}$ with $\delta'(\alpha) \ne 0$.)

The Strong Decomposition Theorem below shows that any Boolean function can be decomposed into the sum of a conditional expectation over a high-rank factor, a function with small Gowers norm, and a function with small $L^2$-norm.

Theorem 5.1 (Strong Decomposition Theorem; Theorem 4.4 of [BFL12]). Suppose $\delta > 0$ is a real and $d \ge 1$ is an integer. Let $\eta : \mathbb{N} \to \mathbb{R}^+$ be an arbitrary non-increasing function and $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary non-decreasing function. Then there exist $N = N_{5.1}(\delta, \eta, r, d)$ and $C = C_{5.1}(\delta, \eta, r, d)$ such that the following holds.

Given $f : \mathbb{F}^n \to \{0,1\}$ where $n > N$, there exist three functions $f_1, f_2, f_3 : \mathbb{F}^n \to \mathbb{R}$ and a polynomial factor $\mathcal{B}$ of degree at most $d$ and complexity at most $C$ such that the following conditions hold:

(i) $f = f_1 + f_2 + f_3$.

(ii) $f_1 = \mathbb{E}[f \mid \mathcal{B}]$.

(iii) $\|f_2\|_{U^{d+1}} \le \eta(|\mathcal{B}|)$.

(iv) $\|f_3\|_2 \le \delta$.

(v) $f_1$ and $f_1 + f_3$ have range $[0,1]$; $f_2$ and $f_3$ have range $[-1,1]$.

(vi) $\mathcal{B}$ is $r$-regular.

It turns out, though, that this Strong Decomposition Theorem is not quite sufficient for the purpose of this paper. The issue is that the bound on $f_3$ above is a constant $\delta$. Ideally, we would want $\delta$ to decrease as a function of the complexity of the polynomial factor, but such a decomposition theorem is simply not true. However, analogous to what is shown in [AFKS00] in the context of graphs, here one can find two polynomial factors $\mathcal{B}' \succeq_{syn} \mathcal{B}$ such that the structured part $f_1$ equals $\mathbb{E}[f \mid \mathcal{B}']$, but now the $L^2$-norm of $f_3$ can be made arbitrarily small in terms of the complexity of the coarser factor $\mathcal{B}$. Furthermore, for most atoms $c$ of $\mathcal{B}$, the function $f : \mathbb{F}^n \to \{0,1\}$ has roughly the same density on $c$ and on most of its subatoms in $\mathcal{B}'$. To make this precise, we make the following definition.

Definition 5.2 (Polynomial factor represents another factor). Given a function $f : \mathbb{F}^n \to \{0,1\}$, a polynomial factor $\mathcal{B}'$ that refines another factor $\mathcal{B}$, and a real $\zeta \in (0,1)$, we say $\mathcal{B}'$ $\zeta$-represents $\mathcal{B}$ with respect to $f$ if for at most a $\zeta$ fraction of atoms $c$ of $\mathcal{B}$, more than a $\zeta$ fraction of the atoms $c'$ lying inside $c$ satisfy $|\mathbb{E}[f \mid c] - \mathbb{E}[f \mid c']| > \zeta$.
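The two nested "fraction" quantifiers in Definition 5.2 are easy to misread, so here is a minimal Python sketch of the check. It assumes the densities $\mathbb{E}[f \mid c]$ and $\mathbb{E}[f \mid c']$ have already been computed and are supplied as plain dictionaries; the function name `zeta_represents` and this input layout are hypothetical choices for illustration, not part of the paper.

```python
def zeta_represents(density, subatom_density, zeta):
    """Check the condition of Definition 5.2 on precomputed densities.

    density[c]         -- E[f | c] for each atom c of the coarse factor
    subatom_density[c] -- list of E[f | c'] over the subatoms c' inside c
    An atom c is "bad" if more than a zeta fraction of its subatoms have
    density differing from E[f | c] by more than zeta.  The finer factor
    zeta-represents the coarser one if at most a zeta fraction of atoms
    are bad.
    """
    bad_atoms = 0
    for c, dens in density.items():
        subs = subatom_density[c]
        far = sum(1 for s in subs if abs(dens - s) > zeta)
        if far > zeta * len(subs):
            bad_atoms += 1
    return bad_atoms <= zeta * len(density)
```

For instance, with two atoms of density 0.5, one split into subatoms of densities (0.5, 0.5) and the other into (0.0, 1.0), the check fails for $\zeta = 0.25$ (one of two atoms is bad) and passes for $\zeta = 0.6$ (no subatom deviates by more than 0.6).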

We can now state the following Super Decomposition Theorem proven in [BFL12]. 

Theorem 5.3 (Super Decomposition Theorem; Theorem 4.9 of [BFL12]). Suppose $\zeta > 0$ is a real and $d, C_0 \ge 1$ are integers. Let $\eta : \mathbb{N} \to \mathbb{R}^+$ and $\delta : \mathbb{N} \to \mathbb{R}^+$ be arbitrary non-increasing functions, and $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary non-decreasing function. Then there exist $N = N_{5.3}(\delta, \eta, r, d, \zeta)$ and $C = C_{5.3}(\delta, \eta, r, d, \zeta)$ such that the following holds.

Given $f : \mathbb{F}^n \to \{0,1\}$ where $n > N$, there exist functions $f_1, f_2, f_3 : \mathbb{F}^n \to \mathbb{R}$, and polynomial factors $\mathcal{B}' \succeq_{syn} \mathcal{B}$ of degree at most $d$ and of complexity at most $C$, such that the following conditions hold:

(i) $f = f_1 + f_2 + f_3$.

(ii) $f_1 = \mathbb{E}[f \mid \mathcal{B}']$.

(iii) $\|f_2\|_{U^{d+1}} \le \eta(|\mathcal{B}'|)$.

(iv) $\|f_3\|_2 \le \delta(|\mathcal{B}|)$.

(v) $f_1$ and $f_1 + f_3$ have range $[0,1]$; $f_2$ and $f_3$ have range $[-1,1]$.

(vi) $\mathcal{B}$ and $\mathcal{B}'$ are both $r$-regular.

(vii) $\mathcal{B}'$ $\zeta$-represents $\mathcal{B}$ with respect to $f$.

Although the above Super Decomposition Theorem may be useful by itself for other applications, we will need a particular variant. The factor $\mathcal{B}'$ is a syntactic refinement of $\mathcal{B}$, and thus is defined by adding new polynomials $Q_1, \ldots, Q_{|\mathcal{B}'|-|\mathcal{B}|}$ to the polynomials defining $\mathcal{B}$. Then for each atom $c$ in the coarser factor $\mathcal{B}$ we will select one atom $c'$ of $\mathcal{B}'$ such that the following hold:

• There is a fixed $s \in \mathbb{T}^{|\mathcal{B}'|-|\mathcal{B}|}$ such that for every atom $c$ in $\mathcal{B}$, its corresponding atom $c'$ is obtained by requiring $(Q_1, \ldots, Q_{|\mathcal{B}'|-|\mathcal{B}|})$ to be equal to $s$.

• The $L^2$-norm of $f_3$ conditioned inside every such atom (i.e., $\mathbb{E}_{x \in c'}[|f_3(x)|^2]$) is small.

• Most subatoms $c'$ will "well-represent" (in the sense of Definition 5.2) their corresponding atoms $c$ from $\mathcal{B}$.

Before stating this formally, let us also take this opportunity to remark that it is possible to adapt the proofs of the above decomposition theorems to decompose several functions $f^{(1)}, \ldots, f^{(R)} : \mathbb{F}^n \to \{0,1\}$ simultaneously. Alternatively, this could be thought of as decomposing a single vector-valued function $f : \mathbb{F}^n \to \{0,1\}^R$. Now we finally state the decomposition theorem that we will use in the proof of our main result.

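Theorem 5.4 (Subatom Selection; Theorem 4.12 of [BFL12]). Suppose $\zeta > 0$ is a real and $d, R \ge 1$ are integers. Let $\eta, \delta : \mathbb{N} \to \mathbb{R}^+$ be arbitrary non-increasing functions, and let $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary non-decreasing function. Then, there exists $C = C_{5.4}(\delta, \eta, r, \zeta, R)$ such that the following holds.

Given $f^{(1)}, \ldots, f^{(R)} : \mathbb{F}^n \to \{0,1\}$, there exist functions $f_1^{(i)}, f_2^{(i)}, f_3^{(i)} : \mathbb{F}^n \to \mathbb{R}$ for all $i \in [R]$, a polynomial factor $\mathcal{B}$ of degree $d$ with atoms denoted by elements of $\mathbb{T}^{|\mathcal{B}|}$, a syntactic refinement $\mathcal{B}' \succeq_{syn} \mathcal{B}$ of degree $d$ with complexity at most $C$ and atoms denoted by elements of $\mathbb{T}^{|\mathcal{B}|} \times \mathbb{T}^{|\mathcal{B}'|-|\mathcal{B}|}$, and an element $s \in \mathbb{T}^{|\mathcal{B}'|-|\mathcal{B}|}$ such that the following is true:

(i) $f^{(i)} = f_1^{(i)} + f_2^{(i)} + f_3^{(i)}$ for every $i \in [R]$.

(ii) $f_1^{(i)} = \mathbb{E}[f^{(i)} \mid \mathcal{B}']$ for every $i \in [R]$.

(iii) $\|f_2^{(i)}\|_{U^{d+1}} < \eta(|\mathcal{B}'|)$ for every $i \in [R]$.

(iv) For every $i \in [R]$, $f_1^{(i)}$ and $f_1^{(i)} + f_3^{(i)}$ have range $[0,1]$, and $f_2^{(i)}$ and $f_3^{(i)}$ have range $[-1,1]$.

(v) $\mathcal{B}$ and $\mathcal{B}'$ are both $r$-regular.

(vi) For every atom $c \in \mathbb{T}^{|\mathcal{B}|}$ of $\mathcal{B}$, the subatom $c' = (c, s) \in \mathbb{T}^{|\mathcal{B}'|}$ satisfies

$$\mathop{\mathbb{E}}_{x \in (c,s)}\bigl[|f_3^{(i)}(x)|^2\bigr] < \delta(|\mathcal{B}|)^2$$

for every $i \in [R]$.

(vii) If $c$ is an atom of $\mathcal{B}$ chosen uniformly at random, then

$$\Pr_c\Bigl[\max_{i \in [R]} \bigl|\mathbb{E}[f^{(i)} \mid c] - \mathbb{E}[f^{(i)} \mid (c,s)]\bigr| > \zeta\Bigr] < \zeta.$$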

5.2 Big Picture Functions 

Suppose we have a function $f : \mathbb{F}^n \to [R]$, and we want to find out whether it induces a particular affine constraint $(A, \sigma)$, where $A = (L_1, \ldots, L_m)$ is a sequence of affine forms on $\ell$ variables and $\sigma \in [R]^m$. Now, suppose $\mathbb{F}^n$ is partitioned by a polynomial factor $\mathcal{B}$ defined by polynomials $P_1, \ldots, P_C$ of degrees $d_1, \ldots, d_C$ and depths $k_1, \ldots, k_C$. Then, observe that if $b_1, \ldots, b_m \in \mathbb{T}^C$ denote the atoms of $\mathcal{B}$ containing $L_1(x_1, \ldots, x_\ell), \ldots, L_m(x_1, \ldots, x_\ell)$ respectively, it must be the case that $b_1, \ldots, b_m$ are $\mathcal{B}$-consistent with $A$ (as defined in Definition 3.9). Thus, to locate where $f$ might induce $(A, \sigma)$, we should restrict our search to sequences of atoms consistent with $A$.

It will be convenient to "blur" the given function $f$ so as to retain only atom-level information about it. That is, for every atom $c$ of $\mathcal{B}$, we will define $f_{\mathcal{B}}(c) \subseteq [R]$ to be the set of all values that $f$ takes within $c$.

Definition 5.5. Given a function $f : \mathbb{F}^n \to [R]$ and a polynomial factor $\mathcal{B}$, the big picture function of $f$ is the function $f_{\mathcal{B}} : \mathbb{T}^{|\mathcal{B}|} \to \mathcal{P}([R])$, defined by $f_{\mathcal{B}}(c) = \{f(x) : \mathcal{B}(x) = c\}$.
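To make Definition 5.5 concrete, the following Python sketch computes a big picture function by brute force over a tiny domain. It is a toy under stated assumptions: the factor is given as an ordinary Python function returning a tuple of classical polynomial values in $\{0, \ldots, p-1\}$ (instead of elements of $\mathbb{T}$), and the names `big_picture`, `toy_factor` and `toy_f` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def big_picture(f, factor, p, n):
    """Return the map atom -> set of values f takes on that atom.

    f      -- function from tuples in {0..p-1}^n to a label in [R]
    factor -- function from tuples in {0..p-1}^n to an atom (a tuple)
    Atoms that contain no point are simply absent from the result.
    """
    image = defaultdict(set)
    for x in product(range(p), repeat=n):
        image[factor(x)].add(f(x))
    return dict(image)


def toy_factor(x):
    # A single quadratic polynomial x0 * x1 over F_2 defines the factor.
    return ((x[0] * x[1]) % 2,)


def toy_f(x):
    # Takes the value 2 only where x0 = x1 = x2 = 1, and 1 elsewhere.
    return 1 + ((x[0] * x[1]) % 2) * x[2]
```

Here `big_picture(toy_f, toy_factor, 2, 3)` maps the atom `(0,)` to `{1}` and the atom `(1,)` to `{1, 2}`: only the atom-level set of values is retained, not how often each value occurs.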

On the other hand, given any function $g : \mathbb{T}^C \to \mathcal{P}([R])$, and a vector of degrees $\mathbf{d} = (d_1, \ldots, d_C)$ and depths $\mathbf{k} = (k_1, \ldots, k_C)$ (which we think of as corresponding to the degrees and depths of some polynomial factor of complexity $C$), we will define what it means for such a function to "induce" a copy of a given constraint.

Definition 5.6 (Partially induce). Suppose we are given vectors $\mathbf{d} = (d_1, \ldots, d_C) \in \mathbb{Z}_{>0}^C$ and $\mathbf{k} = (k_1, \ldots, k_C) \in \mathbb{Z}_{\ge 0}^C$, a function $g : \prod_{i \in [C]} \mathbb{U}_{k_i+1} \to \mathcal{P}([R])$, and an induced affine constraint $(A, \sigma)$ of size $m$. We say that $g$ partially $(\mathbf{d}, \mathbf{k})$-induces $(A, \sigma)$ if there exists a sequence $b_1, \ldots, b_m \in \mathbb{T}^C$ that is $(\mathbf{d}, \mathbf{k})$-consistent with $A$, and $\sigma_j \in g(b_j)$ for each $j \in [m]$.

Definition 5.6 is justified by the following trivial observation.

Remark 5.7. If $f : \mathbb{F}^n \to [R]$ induces a constraint $(A, \sigma)$, then for a factor $\mathcal{B}$ defined by polynomials of respective degrees $(d_1, \ldots, d_{|\mathcal{B}|}) = \mathbf{d}$ and respective depths $(k_1, \ldots, k_{|\mathcal{B}|}) = \mathbf{k}$, the big picture function $f_{\mathcal{B}}$ partially $(\mathbf{d}, \mathbf{k})$-induces $(A, \sigma)$.
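Definition 5.6 amounts to a search over tuples of atoms, which the following Python sketch spells out. It is a brute-force illustration only: the consistency test is passed in as a predicate (its actual form depends on the dependency sets of $A$), atoms are whatever keys the dictionary `g` uses, and the names `partially_induces` and `sum_zero_mod_3` are hypothetical. The toy predicate corresponds to a single degree-1, depth-0 polynomial over $\mathbb{F}_3$ and the forms $(x, x+y, x+2y)$, for which consistency means the three atom values sum to zero modulo 3.

```python
from itertools import product


def partially_induces(g, sigma, is_consistent):
    """Search for atoms b_1..b_m witnessing that g partially induces (A, sigma).

    g             -- dict mapping each atom to the set of values it supports
    sigma         -- the target pattern (sigma_1, ..., sigma_m)
    is_consistent -- predicate on an m-tuple of atoms, encoding
                     (d, k)-consistency with the constraint A
    Returns the first witnessing tuple found, or None if there is none.
    """
    atoms = list(g)
    for bs in product(atoms, repeat=len(sigma)):
        if is_consistent(bs) and all(s in g[b] for s, b in zip(sigma, bs)):
            return bs
    return None


def sum_zero_mod_3(bs):
    # Toy consistency predicate for one linear polynomial over F_3.
    return sum(bs) % 3 == 0
```

With `g = {0: {1}, 1: {1}, 2: {2}}`, the pattern `(1, 2, 1)` is witnessed by the atoms `(0, 2, 1)`, whereas with `g = {0: {1}, 1: {2}, 2: {2}}` the pattern `(1, 1, 2)` has no consistent witness.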

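To handle a possibly infinite collection $\mathcal{A}$ of affine constraints, we will employ a compactness argument, analogous to one used in [AS08b], to bound the size of the constraint partially induced by the big picture function. Let us make the following definition:

Definition 5.8 (The compactness function). Suppose we are given positive integers $C$ and $d$, and a possibly infinite collection of induced affine constraints $\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), \ldots\}$, where $(A^i, \sigma^i)$ is of size $m_i$. For fixed $\mathbf{d} = (d_1, \ldots, d_C) \in [d]^C$ and $\mathbf{k} = (k_1, \ldots, k_C) \in \bigl[0, \lfloor \tfrac{d-1}{p-1} \rfloor\bigr]^C$, denote by $\mathcal{G}(\mathbf{d}, \mathbf{k})$ the set of functions $g : \prod_{i=1}^{C} \mathbb{U}_{k_i+1} \to \mathcal{P}([R])$ that partially $(\mathbf{d}, \mathbf{k})$-induce some $(A^i, \sigma^i) \in \mathcal{A}$. The compactness function is defined as

$$\Psi_{\mathcal{A}}(C, d) = \max_{\mathbf{d}, \mathbf{k}} \ \max_{g \in \mathcal{G}(\mathbf{d}, \mathbf{k})} \ \min_{\substack{(A^i, \sigma^i) \text{ partially} \\ (\mathbf{d}, \mathbf{k})\text{-induced by } g}} m_i,$$

where the outer max is over vectors $\mathbf{d} = (d_1, \ldots, d_C) \in [d]^C$ and $\mathbf{k} = (k_1, \ldots, k_C) \in \bigl[0, \lfloor \tfrac{d-1}{p-1} \rfloor\bigr]^C$.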



Whenever $\mathcal{G}(\mathbf{d}, \mathbf{k})$ is empty, we set the corresponding maximum to 0.

Note that $\Psi_{\mathcal{A}}(C, d)$ is indeed finite, as for given $C$ and $d$ there are only finitely many possible degree and depth sequences, and each $\mathcal{G}(\mathbf{d}, \mathbf{k})$ is a finite set.

Remark 5.9. Note that if a function $g : \mathbb{T}^C \to \mathcal{P}([R])$ partially $(\mathbf{d}, \mathbf{k})$-induces some constraint from $\mathcal{A}$ where $\mathbf{d} \in [d]^C$, then $g$ must belong to $\mathcal{G}(\mathbf{d}, \mathbf{k})$, and consequently it will necessarily partially induce some $(A^i, \sigma^i) \in \mathcal{A}$ whose size is at most $\Psi_{\mathcal{A}}(C, d)$.
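The max-min structure of the compactness function can be illustrated with a small Python sketch. It is a deliberately reduced toy: it handles one fixed pair $(\mathbf{d}, \mathbf{k})$ and a finite list of constraints, so the outer maximum over degree and depth sequences is omitted, and the "partially induces" relation is supplied as a lookup. All names (`compactness`, `CONSTRAINTS`, `INDUCED`, `toy_induces`) are hypothetical.

```python
def compactness(candidates, constraints, induces):
    """Max over functions g of the smallest constraint size that g induces.

    candidates  -- iterable of candidate functions g (any hashable labels)
    constraints -- list of (name, size) pairs standing for (A^i, sigma^i), m_i
    induces     -- predicate induces(g, name): does g partially induce it?
    Functions inducing no constraint lie outside G(d, k) and are skipped;
    if nothing is left, the value is 0, as in the definition.
    """
    best = 0
    for g in candidates:
        sizes = [size for name, size in constraints if induces(g, name)]
        if sizes:
            best = max(best, min(sizes))
    return best


# Toy data: three constraints and three candidate functions.
CONSTRAINTS = [("small", 2), ("medium", 3), ("large", 5)]
INDUCED = {"g1": {"small", "large"}, "g2": {"large"}, "g3": set()}


def toy_induces(g, name):
    return name in INDUCED[g]
```

On this toy data the value is 5: `g1` can always be caught by a constraint of size 2, but `g2` only induces the size-5 constraint, and the compactness function must cover the worst such case.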



5.3 Proof of Testability 

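We prove the main result, Theorem 1.13, in this section. In fact, we will show the following.

Theorem 5.10. Let $d > 0$ be an integer. Suppose we are given a possibly infinite collection of affine constraints $\mathcal{A} = \{(A^1, \sigma^1), (A^2, \sigma^2), \ldots\}$ where each $(A^i, \sigma^i)$ is an affine constraint of complexity $\le d$, and of size $m_i$ on $\ell_i$ variables. Then, there are functions $\ell_{\mathcal{A}} : (0,1) \to \mathbb{Z}_{>0}$ and $\delta_{\mathcal{A}} : (0,1) \to (0,1)$ such that the following is true for any $\varepsilon \in (0,1)$. If a function $f : \mathbb{F}^n \to [R]$ is $\varepsilon$-far from being $\mathcal{A}$-free, then $f$ induces at least $\delta_{\mathcal{A}}(\varepsilon) p^{n \ell_i}$ many copies of some $(A^i, \sigma^i)$ with $\ell_i < \ell_{\mathcal{A}}(\varepsilon)$.

Moreover, if $\mathcal{A}$ is locally characterized, then $\ell_{\mathcal{A}}$ is a constant independent of $\varepsilon$.

Theorem 1.13 immediately follows. Consider the following test: choose uniformly at random $x_1, \ldots, x_{\ell_{\mathcal{A}}(\varepsilon)} \in \mathbb{F}^n$, let $H$ denote the affine space $\{x_1 + \sum_{j=2}^{\ell_{\mathcal{A}}(\varepsilon)} \lambda_j x_j : \lambda_j \in \mathbb{F}\}$, and check whether $f$ restricted to $H$ is $\mathcal{A}$-free or not, thus making $\le p^{\ell_{\mathcal{A}}(\varepsilon)}$ queries. By Theorem 5.10, if $f$ is $\varepsilon$-far from $\mathcal{A}$-freeness, this test rejects with probability at least $\delta_{\mathcal{A}}(\varepsilon)$.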
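The sampling step of such a test can be sketched in Python. The sketch below is a simplified one-sample variant, not the test described above: instead of inspecting all of $H$, it only evaluates a single induced constraint $(A, \sigma)$ at the sampled points $L_j(x_1, \ldots, x_\ell)$, all of which lie in $H$. Each affine form is assumed to be given as a coefficient tuple over $\mathbb{F}_p$ with first coefficient 1; the names `apply_form` and `one_sample_test` are hypothetical.

```python
import random


def apply_form(coeffs, xs, p):
    """Evaluate the form sum_i coeffs[i] * xs[i] coordinate-wise modulo p."""
    n = len(xs[0])
    return tuple(sum(c * x[i] for c, x in zip(coeffs, xs)) % p for i in range(n))


def one_sample_test(f, forms, sigma, n, p, rng):
    """Sample x_1..x_l uniformly and check one copy of the constraint.

    f     -- function from tuples in {0..p-1}^n to labels in [R]
    forms -- list of coefficient tuples (each of length l, first entry 1)
    sigma -- target labels, one per form
    Returns True (accept) unless f(L_j(x)) == sigma_j holds for every j.
    """
    num_vars = len(forms[0])
    xs = [tuple(rng.randrange(p) for _ in range(n)) for _ in range(num_vars)]
    induced = all(f(apply_form(L, xs, p)) == s for L, s in zip(forms, sigma))
    return not induced
```

A function that is constantly 1 is rejected on every sample for the pattern `sigma = (1, 1)`, while a function that never takes the value 1 is always accepted; anything in between is rejected with probability equal to the density of induced copies, which is the quantity that Theorem 5.10 bounds from below.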



Proof of Theorem 5.10:

Preliminaries. Fix a function $f : \mathbb{F}^n \to [R]$ that is $\varepsilon$-far from being $\mathcal{A}$-free. For $i \in [R]$, define $f^{(i)} : \mathbb{F}^n \to \{0,1\}$ so that $f^{(i)}(x)$ equals 1 when $f(x) = i$ and equals 0 otherwise. Additionally, set the following parameters, where $\Psi_{\mathcal{A}}$ is the compactness function from Definition 5.8:

aiO = p-^dC^A{C4)^ p^c) = r2.i2(d,a(C)), C = ^, 



Decomposing by regular factors. Next, apply Theorem 5.4 to the functions $f^{(1)}, f^{(2)}, \ldots, f^{(R)}$ in order to get polynomial factors $\mathcal{B}' \succeq_{syn} \mathcal{B}$ of complexity at most $C_{5.4}(\Delta, \eta, \rho, \zeta, R)$, an element $s \in \mathbb{T}^{|\mathcal{B}'|-|\mathcal{B}|}$, and functions $f_1^{(i)}, f_2^{(i)}, f_3^{(i)} : \mathbb{F}^n \to \mathbb{R}$ for each $i \in [R]$ with the desired properties. The sequence of polynomials generating $\mathcal{B}'$ will be denoted by $P_1, \ldots, P_{|\mathcal{B}'|}$. Since $\mathcal{B}'$ is a syntactic refinement, we can assume $\mathcal{B}$ is generated by the polynomials $P_1, \ldots, P_{|\mathcal{B}|}$. Let $C = |\mathcal{B}|$ and $C' = |\mathcal{B}'|$. Note that $\|\mathcal{B}\| \le p^{(k_{\max}+1)C} \le p^{dC}$, where $k_{\max} \le \lfloor (d-1)/(p-1) \rfloor$ is the maximum depth of a polynomial in $\mathcal{B}$. Denote the degree of $P_i$ by $d_i$ and the depth of $P_i$ by $k_i$.

Cleanup. Based on $\mathcal{B}'$ and $\mathcal{B}$, we construct a function $F : \mathbb{F}^n \to [R]$ that is $\frac{\varepsilon}{2}$-close to $f$ and hence still violates $\mathcal{A}$-freeness. The "cleaner" structure of $F$ will help us locate the induced constraint violated by $f$.

The function $F$ is the same as $f$ except for the following: For every atom $c$ of $\mathcal{B}$, let $t_c = \arg\max_{j \in [R]} \Pr[f(x) = j \mid \mathcal{B}'(x) = (c,s)]$ be the most popular value inside the corresponding subatom $(c,s)$.

• Poorly-represented atoms: If there exists $i \in [R]$ such that $|\Pr[f(x) = i \mid \mathcal{B}(x) = c] - \Pr[f(x) = i \mid \mathcal{B}'(x) = (c,s)]| > \zeta$, then set $F(z) = t_c$ for every $z$ in the atom $c$.

• Unpopular values: Otherwise, for any $z$ in the atom $c$ with $0 \le \Pr_x[f(x) = f(z) \mid \mathcal{B}'(x) = (c,s)] < \zeta$, set $F(z) = t_c$.

A key property of the cleanup function $F$ is that it supports a value inside an atom $c$ of $\mathcal{B}$ only if the original function $f$ acquires the value on at least a $\zeta$ fraction of the subatom $(c,s)$. Furthermore, as the following lemma shows, it is $\varepsilon/2$-close to $f$, and therefore it is not $\mathcal{A}$-free.
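The two cleanup rules can be sketched in Python on a finite toy domain. The sketch assumes that each point carries a precomputed coarse-atom label and a refinement label, that the selected subatom of an atom $c$ consists of the points whose refinement label equals `s`, and that every selected subatom is non-empty. The function name `cleanup` and the dictionary-based input layout are hypothetical.

```python
from collections import Counter


def cleanup(f, coarse, fine, s, zeta, R):
    """Return the cleaned-up function F as a dict point -> value in 1..R.

    f      -- dict point -> value in 1..R
    coarse -- dict point -> atom c of the coarse factor
    fine   -- dict point -> refinement coordinate; the selected subatom of c
              is the set of points of c whose coordinate equals s
    """
    F = dict(f)
    for c in set(coarse.values()):
        atom = [z for z in f if coarse[z] == c]
        sub = [z for z in atom if fine[z] == s]  # assumed non-empty
        in_atom = Counter(f[z] for z in atom)
        in_sub = Counter(f[z] for z in sub)
        pr_atom = {i: in_atom[i] / len(atom) for i in range(1, R + 1)}
        pr_sub = {i: in_sub[i] / len(sub) for i in range(1, R + 1)}
        # Most popular value inside the selected subatom (ties: smallest).
        t_c = max(range(1, R + 1), key=lambda i: pr_sub[i])
        # Rule 1: the subatom misrepresents the atom for some value.
        poorly = any(abs(pr_atom[i] - pr_sub[i]) > zeta for i in pr_atom)
        for z in atom:
            # Rule 2: the value f(z) is unpopular inside the subatom.
            if poorly or pr_sub[f[z]] < zeta:
                F[z] = t_c
    return F
```

After this step, every value that $F$ takes inside an atom occurs with density at least $\zeta$ in the selected subatom, which is the property used later when the main term is bounded from below.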

Lemma 5.11. The cleanup function $F$ is $\varepsilon/2$-close to $f$, and therefore it is not $\mathcal{A}$-free.

Proof. The first step applies to at most $\zeta \|\mathcal{B}\|$ atoms, since $\mathcal{B}'$ $\zeta$-represents $\mathcal{B}$ with respect to each of $f^{(1)}, \ldots, f^{(R)}$. By Lemma 3.2, each atom occupies at most a $\frac{1}{\|\mathcal{B}\|} + \alpha(C)$ fraction of the entire domain. So, the fraction of points whose values are set in the first step is at most $\zeta \|\mathcal{B}\| \bigl(\frac{1}{\|\mathcal{B}\|} + \alpha(C)\bigr) < 2\zeta$.

In the second step, if $\Pr[f(x) = f(z) \mid \mathcal{B}'(x) = (c,s)] < \zeta$, then $\Pr[f(x) = f(z) \mid \mathcal{B}(x) = c] < \Pr[f(x) = f(z) \mid \mathcal{B}'(x) = (c,s)] + \zeta < 2\zeta$. Hence, the fraction of the points whose values are set in the second step is at most $2\zeta R = \varepsilon/4$.

Thus, the distance of $F$ from $f$ is bounded by $2\zeta + \varepsilon/4 < \varepsilon/2$. □

Locating a violated constraint. We now want to use $F$ to "find" the affine constraint induced in $f$. Setting $\mathbf{d} = (d_1, \ldots, d_C)$ and $\mathbf{k} = (k_1, \ldots, k_C)$, we have by Remark 5.7 that the big picture function $F_{\mathcal{B}}$ of $F$ will partially $(\mathbf{d}, \mathbf{k})$-induce some constraint from $\mathcal{A}$, and hence by Remark 5.9, it will partially $(\mathbf{d}, \mathbf{k})$-induce some $(A, \sigma) \in \mathcal{A}$ of size $m \le \Psi_{\mathcal{A}}(C, d)$ on $\ell$ variables. We will show that the original function $f$ violates many instances of this constraint.

Denote the affine forms in $A$ by $(L_1, \ldots, L_m)$ and the vector $\sigma$ by $(\sigma_1, \ldots, \sigma_m)$. Since we can assume $\ell \le m$ (without loss of generality, by making a change of variables), we can now define

$$\ell_{\mathcal{A}}(\varepsilon) = \Psi_{\mathcal{A}}(C_{5.4}(\Delta, \eta, \rho, \zeta, R), d). \qquad (7)$$

Let $b_1, \ldots, b_m \in \prod_{i=1}^{C} \mathbb{U}_{k_i+1}$ correspond to the atoms of $\mathcal{B}$ where $(A, \sigma)$ is partially $(\mathbf{d}, \mathbf{k})$-induced by $F_{\mathcal{B}}$. That is, $b_1, \ldots, b_m$ are consistent with $A$, and $\sigma_j \in F_{\mathcal{B}}(b_j)$ for every $j \in [m]$. Also, let $b'_1, \ldots, b'_m \in \prod_{i=1}^{C'} \mathbb{U}_{k_i+1}$ index the associated subatoms in $\mathcal{B}'$, obtained by letting $b'_j = (b_j, s)$ for every $j \in [m]$.

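Lemma 5.12. The subatoms $b'_1, \ldots, b'_m$ are consistent with $A$.

Proof. Since $b_1, \ldots, b_m$ are already consistent with $A$, we only need to show that for every $i \in [C+1, C']$, the sequence $(b'_{1,i}, \ldots, b'_{m,i}) = (s_{i-C}, s_{i-C}, \ldots, s_{i-C})$ is $(d_i, k_i)$-consistent. This holds because a constant function is of degree $\le d_i$. □

The main analysis. Let $\mathbf{x} = (x_1, \ldots, x_\ell)$ where $x_1, \ldots, x_\ell$ are independent random variables taking values in $\mathbb{F}^n$ uniformly. Our goal is to prove a lower bound on

$$\Pr_{\mathbf{x}}\bigl[f(L_1(\mathbf{x})) = \sigma_1 \wedge \cdots \wedge f(L_m(\mathbf{x})) = \sigma_m\bigr] = \mathop{\mathbb{E}}_{\mathbf{x}}\bigl[f^{(\sigma_1)}(L_1(\mathbf{x})) \cdots f^{(\sigma_m)}(L_m(\mathbf{x}))\bigr]. \qquad (8)$$

The theorem obviously follows if the above expectation is larger than the respective $\delta_{\mathcal{A}}(\varepsilon)$. We rewrite the expectation as

$$\mathop{\mathbb{E}}_{\mathbf{x}}\Bigl[\prod_{j=1}^{m} \bigl(f_1^{(\sigma_j)} + f_2^{(\sigma_j)} + f_3^{(\sigma_j)}\bigr)(L_j(\mathbf{x}))\Bigr]. \qquad (9)$$

We can expand the expression inside the expectation as a sum of $3^m$ terms. The expectation of any term involving $f_2^{(\sigma_j)}$ for any $j \in [m]$ is bounded in magnitude by $\|f_2^{(\sigma_j)}\|_{U^{d+1}} \le \eta(|\mathcal{B}'|)$, by Lemma 1.12 and the fact that the complexity of $A$ is bounded by $d$. Hence, the expression (9) is at least

$$\mathop{\mathbb{E}}_{\mathbf{x}}\Bigl[\bigl(f_1^{(\sigma_1)} + f_3^{(\sigma_1)}\bigr)(L_1(\mathbf{x})) \cdots \bigl(f_1^{(\sigma_m)} + f_3^{(\sigma_m)}\bigr)(L_m(\mathbf{x}))\Bigr] - 3^m \eta(|\mathcal{B}'|).$$

Now, because of the non-negativity of $f_1^{(\sigma_j)} + f_3^{(\sigma_j)}$ for every $j \in [m]$, this is at least

$$\mathop{\mathbb{E}}_{\mathbf{x}}\Bigl[\prod_{j \in [m]} \bigl(f_1^{(\sigma_j)} + f_3^{(\sigma_j)}\bigr)(L_j(\mathbf{x})) \cdot \prod_{j \in [m]} 1_{[\mathcal{B}'(L_j(\mathbf{x})) = b'_j]}\Bigr] - 3^m \eta(|\mathcal{B}'|), \qquad (10)$$

where $1_{[\mathcal{B}'(L_j(\mathbf{x})) = b'_j]}$ is the indicator function of the event $\mathcal{B}'(L_j(\mathbf{x})) = b'_j$. In other words, now we are only counting patterns that arise from the selected subatoms $b'_1, \ldots, b'_m$. We next expand the product inside the expectation into $2^m$ terms. We will show that the contribution from each of the $2^m - 1$ terms involving $f_3^{(\sigma_k)}$ for any $k \in [m]$ is small. Such a term is trivially bounded from above by

$$\mathop{\mathbb{E}}_{\mathbf{x}}\Bigl[\bigl|f_3^{(\sigma_k)}(L_k(\mathbf{x}))\bigr| \prod_{j \in [m]} 1_{[\mathcal{B}'(L_j(\mathbf{x})) = b'_j]}\Bigr]. \qquad (11)$$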

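Without loss of generality, we assume that $k = 1$. This is convenient as by Definition 1.8 (i) we have $L_1(\mathbf{x}) = x_1$. (For other values of $k$, we can do a change of variables, replacing $x_1$ with $L_k(\mathbf{x})$, so that we can assume $L_k(\mathbf{x}) = x_1$.) With the assumption $k = 1$, the square of (11) is equal to the following, which we bound by the Cauchy-Schwarz inequality:

$$\Bigl(\mathop{\mathbb{E}}_{x_1}\Bigl[\bigl|f_3^{(\sigma_1)}(x_1)\bigr| 1_{[\mathcal{B}'(x_1) = b'_1]} \mathop{\mathbb{E}}_{x_2, \ldots, x_\ell}\Bigl[\prod_{j \in [m]} 1_{[\mathcal{B}'(L_j(\mathbf{x})) = b'_j]}\Bigr]\Bigr]\Bigr)^2$$

$$\le \mathop{\mathbb{E}}_{x_1}\Bigl[\bigl|f_3^{(\sigma_1)}(x_1)\bigr|^2 1_{[\mathcal{B}'(x_1) = b'_1]}\Bigr] \cdot \mathop{\mathbb{E}}_{x_1}\Bigl[\Bigl(\mathop{\mathbb{E}}_{x_2, \ldots, x_\ell}\Bigl[\prod_{j \in [m]} 1_{[\mathcal{B}'(L_j(\mathbf{x})) = b'_j]}\Bigr]\Bigr)^2\Bigr]. \qquad (12)$$

By Theorem 5.4 (vi) and Lemma 3.2, we have

$$\mathop{\mathbb{E}}_{x_1}\Bigl[\bigl|f_3^{(\sigma_1)}(x_1)\bigr|^2 1_{[\mathcal{B}'(x_1) = b'_1]}\Bigr] \le \Delta^2(C) \Pr[\mathcal{B}'(x_1) = b'_1] \le \Delta^2(C)\Bigl(\frac{1}{\|\mathcal{B}'\|} + \alpha(C')\Bigr). \qquad (13)$$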



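Let $\mathbf{y} = (y_2, \ldots, y_\ell)$ where $y_2, \ldots, y_\ell$ are independent random variables taking values in $\mathbb{F}^n$ uniformly. Writing $b'_{i,j}$ for the $i$-th coordinate of $b'_j$, the second term in the right hand side of (12) is equal to

$$\mathop{\mathbb{E}}_{x_1}\Bigl[\Bigl(\mathop{\mathbb{E}}_{x_2, \ldots, x_\ell}\Bigl[\prod_{\substack{i \in [C'] \\ j \in [m]}} \frac{1}{p^{k_i+1}} \sum_{\lambda_{i,j}=0}^{p^{k_i+1}-1} e\bigl(\lambda_{i,j}(P_i(L_j(\mathbf{x})) - b'_{i,j})\bigr)\Bigr]\Bigr)^2\Bigr]$$

$$= \frac{1}{\|\mathcal{B}'\|^{2m}} \mathop{\mathbb{E}}_{x_1}\Bigl[\Bigl(\sum_{(\lambda_{i,j}) \in \prod_{i,j}[0, p^{k_i+1}-1]} e\Bigl(-\sum_{i,j} \lambda_{i,j} b'_{i,j}\Bigr) \mathop{\mathbb{E}}_{x_2, \ldots, x_\ell}\Bigl[e\Bigl(\sum_{\substack{i \in [C'] \\ j \in [m]}} \lambda_{i,j} P_i(L_j(\mathbf{x}))\Bigr)\Bigr]\Bigr)^2\Bigr]$$

$$= \frac{1}{\|\mathcal{B}'\|^{2m}} \sum_{\substack{(\lambda_{i,j}), (\tau_{i,j}) \in \\ \prod_{i,j}[0, p^{k_i+1}-1]}} e\Bigl(-\sum_{i,j} (\lambda_{i,j} + \tau_{i,j}) b'_{i,j}\Bigr) \mathop{\mathbb{E}}_{\substack{x_1, x_2, \ldots, x_\ell \\ y_2, \ldots, y_\ell}}\Bigl[e\Bigl(\sum_{\substack{i \in [C'] \\ j \in [m]}} \lambda_{i,j} P_i(L_j(\mathbf{x})) + \sum_{\substack{i \in [C'] \\ j \in [m]}} \tau_{i,j} P_i(L_j(x_1, \mathbf{y}))\Bigr)\Bigr]. \qquad (14)$$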



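We can bound the above using Theorem 3.3. Let $A'$ denote the set of $2m$ linear forms $\{L_j(x_1, x_2, \ldots, x_\ell) \mid j \in [m]\} \cup \{L_j(x_1, y_2, \ldots, y_\ell) \mid j \in [m]\}$ in variables $x_1, \ldots, x_\ell, y_2, \ldots, y_\ell$. Let $\Lambda_i$ and $\Lambda'_i$ denote the $(d_i, k_i)$-dependency sets of $A$ and $A'$ respectively.

Lemma 5.13. For each $i$, $|\Lambda'_i| = |\Lambda_i|^2 \cdot p^{k_i+1}$.

Proof. Recall that $L_1(\mathbf{x}) = L_1(x_1, \mathbf{y}) = x_1$. For any $\lambda, \tau \in \Lambda_i$ and any $a \in [0, p^{k_i+1} - 1]$, note that $(\lambda_1 + a \pmod{p^{k_i+1}}, \lambda_2, \ldots, \lambda_m, \tau_1 - a \pmod{p^{k_i+1}}, \tau_2, \ldots, \tau_m) \in \Lambda'_i$. Hence, $|\Lambda'_i| \ge |\Lambda_i|^2 \cdot p^{k_i+1}$. To show $|\Lambda'_i| \le |\Lambda_i|^2 \cdot p^{k_i+1}$, we give a map from $\Lambda'_i$ to $\Lambda_i \times \Lambda_i$ that is $p^{k_i+1}$-to-1. Suppose $\sum_{j=1}^{m} \lambda_j Q(L_j(x_1, x_2, \ldots, x_\ell)) + \sum_{j=1}^{m} \tau_j Q(L_j(x_1, y_2, \ldots, y_\ell)) \equiv 0$ for every polynomial $Q$ of degree $d_i$ and depth $k_i$. Setting $x_2 = \cdots = x_\ell = 0$ shows that

$$\sum_{j=1}^{m} \tau_j Q(L_j(x_1, y_2, \ldots, y_\ell)) \equiv -\Bigl(\sum_{j=1}^{m} \lambda_j\Bigr) Q(x_1),$$

and similarly setting $y_2 = \cdots = y_\ell = 0$ shows

$$\sum_{j=1}^{m} \lambda_j Q(L_j(x_1, x_2, \ldots, x_\ell)) \equiv -\Bigl(\sum_{j=1}^{m} \tau_j\Bigr) Q(x_1).$$

In particular, $\sum_{j=1}^{m} \lambda_j = -\sum_{j=1}^{m} \tau_j$. Consequently,

$$(\lambda, \tau) \mapsto \Bigl(\Bigl(-\sum_{j=2}^{m} \lambda_j \pmod{p^{k_i+1}}, \lambda_2, \ldots, \lambda_m\Bigr), \Bigl(-\sum_{j=2}^{m} \tau_j \pmod{p^{k_i+1}}, \tau_2, \ldots, \tau_m\Bigr)\Bigr)$$

is a map from $\Lambda'_i$ to $\Lambda_i \times \Lambda_i$. To see that it is $p^{k_i+1}$-to-1, note that

$$(\lambda_1 + \tau_1 - \gamma \pmod{p^{k_i+1}}, \lambda_2, \ldots, \lambda_m, \gamma, \tau_2, \ldots, \tau_m) \in \Lambda'_i$$

for every $\gamma \in [0, p^{k_i+1} - 1]$, and these elements are all mapped to the same element in $\Lambda_i \times \Lambda_i$. □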
Applying Theorem 3.3 (just as in the proof of Theorem 3.10), we get that

$$(14) \le \frac{1}{\|\mathcal{B}'\|^{2m}} \Bigl(\prod_{i=1}^{C'} |\Lambda'_i| + \|\mathcal{B}'\|^{2m} \alpha(C')\Bigr) = \frac{\prod_{i=1}^{C'} |\Lambda_i|^2 \, p^{k_i+1}}{\|\mathcal{B}'\|^{2m}} + \alpha(C').$$

Combining this with Eq. (13) and Eq. (12), we obtain



:ii)^2A(c)Wii|iLA + «(co. 



(15) 



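Finally, we turn to the main term in the expansion of Eq. (10). We know from Lemma 5.12 that the subatoms $b'_1, \ldots, b'_m$ are consistent with $A$. Thus

$$\mathop{\mathbb{E}}_{\mathbf{x}}\Bigl[f_1^{(\sigma_1)}(L_1(\mathbf{x})) \cdots f_1^{(\sigma_m)}(L_m(\mathbf{x})) \cdot \prod_{j \in [m]} 1_{[\mathcal{B}'(L_j(\mathbf{x})) = b'_j]}\Bigr]$$

$$= \Pr_{\mathbf{x}}\bigl[\mathcal{B}'(L_1(\mathbf{x})) = b'_1 \wedge \cdots \wedge \mathcal{B}'(L_m(\mathbf{x})) = b'_m\bigr] \cdot \mathop{\mathbb{E}}_{\mathbf{x}}\Bigl[f_1^{(\sigma_1)}(L_1(\mathbf{x})) \cdots f_1^{(\sigma_m)}(L_m(\mathbf{x})) \,\Big|\, \forall j \in [m], \ \mathcal{B}'(L_j(\mathbf{x})) = b'_j\Bigr]$$

$$\ge \Bigl(\frac{\prod_{i=1}^{C'} |\Lambda_i|}{\|\mathcal{B}'\|^m} - \alpha(C')\Bigr) \cdot \zeta^m. \qquad (16)$$

Let us justify the last line. The first term is due to the lower bound on the probability from Theorem 3.10. The second term in (16) follows since each $f_1^{(\sigma_j)}$ is constant on the atoms of $\mathcal{B}'$, and because by construction, the big picture function $F_{\mathcal{B}}$ of the cleanup function $F$, on which $(A, \sigma)$ was partially induced, supports a value inside an atom $b$ of $\mathcal{B}$ only if the original function $f$ acquires the value on at least a $\zeta$ fraction of the subatom $(b, s)$.

Setting $\beta = \prod_{i=1}^{C'} |\Lambda_i| / \|\mathcal{B}'\|^m$ and combining the bounds from (10), (15) and (16), we conclude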

- 2"^+iA(C7) V/J^ + a{C') - ?r ■ r]{C') 



(8) ^ (/3 - a{C')) • 

/3 / e \^Aic,d) 



> 



2 \8RJ 
1 



^ /3 ^ 1, A(C) = Mmf^^''''^ viC) < ^^^^ 

(17) 



Since \\B'\\i^ p'^^" 



||g/||*yl(C,d) ^ /- — ifciV8_K/ J -IK- I - g||g, 

and both C and C are upper-bounded by C5 4 (A, 7/, i?), we can now define 



V8i? 



to conclude the proof. 



□